(57) Abstract

DEN1-S275/90 (ECACC V92042111) is a new strain of Dengue virus serotype 1. The complete cDNA sequence of this
virus has been cloned and protein-coding fragments thereof have been used in the construction of expression plasmids.
DEN1-S275/90 in inactivated form, DEN1-S275/90 polypeptides or fusion proteins thereof can be incorporated into vaccines
for immunisation against DEN1-S275/90 and other DEN1 viruses. The invention further provides diagnostic reagents, e.g.
labelled antibodies to DEN1-S275/90 proteins, and kits to detect DEN1 virus.
CDNA SEQUENCE OF DENGUE VIRUS SEROTYPE 1 (SINGAPORE STRAIN)

The present invention relates to Dengue Virus Type 1.
Dengue virus infection may lead to dengue fever (DF) or its
more severe forms, dengue haemorrhagic fever (DHF) and dengue
shock syndrome (DSS). DHF is an important virus disease of
global significance, especially in Southeast Asia. There
are four serotypes of Dengue virus (DEN1, DEN2, DEN3 and
DEN4) belonging to the family Flaviviridae.

The complete genomic sequence of DEN2 (Jamaica) has
been published by Deubel et al; Virology 165, 234-244
(1988). The complete genomic sequence of DEN3 (H87) has
been published by Osatomi and Sumiyoshi; Virology 176,
643-647 (1990). The complete genomic sequence of DEN4 has been
published by Zhao et al; Virology 155, 77-88. To date,
only a partial sequence of any variant of DEN1, DEN1 (Nauru
Island), has been determined; Mason et al, Virology 161,
262-267 (1987).

We have now identified a previously unknown strain of
DEN1 and established its complete nucleotide sequence. The
new strain, DEN1-S275/90, was deposited at the European
Collection of Animal Cell Cultures (ECACC), Porton Down, GB
under Budapest Treaty conditions on 21 April 1992 and given
accession number V92042111. DEN1-S275/90 differs
significantly from DEN2, DEN3 and DEN4 in terms of sequence
homology. There are also a number of significant
differences between DEN1-S275/90 and DEN1 (Nauru Island).

The present invention thus provides DEN1-S275/90
(ECACC V92042111). The invention further provides
DEN1-S275/90 (ECACC V92042111) for use as a diagnostic reagent.
The invention also provides DEN1-S275/90 in inactivated
form for use as a diagnostic reagent or a vaccine.

The invention also provides the nucleic acid sequence
of Seq. ID No. 1 and DNA sequences substantially
corresponding to Seq. ID No. 1, e.g. degenerate variants
thereof having one or more nucleotide changes but
nevertheless capable of being translated to give the same
protein sequence. The invention further provides fragments
of such DNA polynucleotides, in particular the fragments
encoding the C, C', PreM, M, E, NS1, NS2A, NS2B, NS3, NS4A,
NS4B and NS5 genes of the genome of the virus. The start
and end points of these preferred fragments in the nucleic
acid sequence of Seq. ID No. 1 are shown below in Table 1.
Table 1 also shows the start and end points of the proteins
encoded by these genes, using the numbering of Seq. ID Nos.
1 and 2.

TABLE 1

Start and end points of the nucleic acid (n) numbers
encoding the genes of S275/90. The table also shows the
start and end points of the corresponding proteins (p)
within the polyprotein encoded by S275/90.



Gene     Start (n)    End (n)    Start (p)    End (p)

C             81         422          1          114
C'           123         422         15          114
PreM         423         695        115          205
M            696         920        206          280
E            921        2402        281          774
NS1         2403        3464        775         1128
NS2A        3465        4112       1129         1344
NS2B        4113        4499       1345         1474
NS3         4500        6359       1475         2093
NS4A        6360        6809       2094         2242
NS4B        6810        7556       2243         2492
NS5         7557       10268       2493         3396
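
The relationship between the nucleotide (n) and protein (p) numbering can be illustrated with a short sketch. It assumes only what the description states, namely that the open reading frame starts at nucleotide 81 of Seq. ID No. 1; the function names `residue_number` and `codon_span` are hypothetical and are introduced here purely for illustration.

```python
# Illustrative sketch: convert between nucleotide positions in Seq. ID No. 1
# and residue numbers of the encoded polyprotein.
# Assumption: the coding region begins at nucleotide 81 (first base of AUG).
CDS_START = 81


def residue_number(nucleotide: int) -> int:
    """Return the 1-based residue whose codon contains `nucleotide`."""
    if nucleotide < CDS_START:
        raise ValueError("position lies in the 5' noncoding region")
    return (nucleotide - CDS_START) // 3 + 1


def codon_span(residue: int) -> tuple:
    """Return the (first, last) nucleotide positions of a residue's codon."""
    first = CDS_START + 3 * (residue - 1)
    return first, first + 2


if __name__ == "__main__":
    # A few boundaries taken from Table 1: (gene, start nucleotide, end nucleotide)
    for gene, start, end in [("C", 81, 422), ("C'", 123, 422), ("NS5", 7557, 10268)]:
        print(gene, residue_number(start), residue_number(end))
```

For the three rows used above, the computed residue numbers agree with the Start (p) and End (p) columns of Table 1.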



The nucleic acid sequences of the invention may be
used as probes in an assay to determine the presence or
absence of DEN1-S275/90, or they may be incorporated into a
vector, e.g. an expression vector.

Nucleic acid fragments according to the invention may
be made by known methods of chemical synthesis or cloned
from the virus itself using known recombinant techniques.
Fragments according to the invention may also be produced
by replication of DNA or RNA, by transcription from DNA to
form RNA fragments or reverse transcription from RNA
fragments to form DNA fragments. Such transcription may be
in a cell free system or may be effected in cells, for
instance by cloning. Cell free systems include an
appropriate replicase, transcriptase or reverse
transcriptase, suitable nucleotide precursors and a nucleic
acid template of appropriate sequence, together with
buffers and any necessary or desirable cofactors.

The present invention also provides a polyprotein as
set forth in Seq. ID No. 1 and Seq. ID No. 2 and fragments
thereof, e.g. the C, C', PreM, M, E, NS1, NS2A, NS2B, NS3,
NS4A, NS4B and NS5 proteins as identified above in Table 1.
The invention thus provides a polypeptide having an amino
acid sequence substantially corresponding to the sequence
shown in Seq. ID No. 2 or a fragment thereof. Fusion
proteins which incorporate these peptides are also
provided.

The polyprotein and proteins according to the
invention may be produced by synthetic peptide chemistry or
by expressing vectors carrying DNA encoding the proteins in
a suitable cell in order to produce expression of the DNA,
followed by recovery of the expressed protein. Methods of
expressing and recovering recombinant proteins, including
fusion proteins, are well known in the art.

For example, for expression of a polypeptide of the
invention, an expression vector may be constructed. An
expression vector is prepared which comprises a DNA
sequence encoding a polypeptide of the invention and which
is capable of expressing the polypeptide when provided with
a suitable host, eucaryotic or procaryotic. Appropriate
transcriptional and translational control elements are
provided, including a promoter for the DNA sequence, a
transcriptional termination site, and translational start



WO 93/22440 



PCT/CA93/00182 



and stop codons. The DNA sequence is provided in the
correct frame such as to enable expression of the
polypeptide to occur in a host compatible with the vector.
The expression vector may be selected to be suitable to
express the nucleic acid sequences of the invention in, for
example, a bacterial (e.g. E. coli), yeast, insect or
mammalian cell. A baculovirus expression system may be
used. The nucleic acid may be expressed in order that a
protein or peptide encoded by the fragment alone is
produced or alternatively it may be expressed to provide a
fusion protein in which DEN1-S275/90 or a protein thereof,
e.g. E, NS1, NS2, NS3 or NS5 as identified in Table 1 above,
is fused to a second amino acid sequence, e.g. a C-terminal
sequence derived from glutathione transferase or maltose
binding protein or a C-terminal or N-terminal signal
sequence. Such a sequence may for example cause the fusion
protein to be exported from the cell. The expression
vector is then provided with an appropriate host. Cells
harbouring the vector are grown so as to enable expression
to occur. The vector may be a plasmid or a viral vector.

Recovery and, where desirable, further purification of
the protein produced by an expression vector in a host cell
may be by means known in the art. Such means are designed
to separate the protein of the invention from the other
proteins of the host cell. Suitable means include
chromatographic separation of the recovered protein.

The polyprotein and peptides of the invention may be
used as immunogens for a vaccine against DEN1-S275/90 and
other DEN1 viruses. Suitably, the proteins and peptides of
the invention will be combined with a pharmaceutically
acceptable carrier or diluent in order to prepare a sterile
vaccine composition. The vaccine composition may then be
used in a method of immunizing a human against DEN1
infections.

Advantageously, a vaccine composition against DEN1 may
comprise a mixture of two or more peptides. For example,
it may comprise one non-structural (NS) peptide, e.g. NS1 or
NS3, together with a capsid (C), M or E peptide. A mixture
of two or more NS peptides could also be used.

The proteins and peptides of the invention may also be
used as antigens in an immunoassay to detect the presence
or absence of DEN1, and especially DEN1-S275/90. The
proteins and peptides are optionally labelled with a
detectable label, e.g. a radioisotope, biotin or a
fluorophore. The immunoassay may be conducted by bringing
a known quantity of labelled protein (antigen) into contact
with a sample suspected of containing antibody against DEN1
and detecting the presence or absence of antibody-antigen
complex containing the labelled antigen.

The invention also provides antibodies against the
above-mentioned proteins and peptides of the invention.
The antibodies may be monoclonal or polyclonal. Monoclonal
antibodies may be produced by hybridoma techniques known in
the art or by recombinant means to provide hybrid
antibodies such as humanized antibodies.

The antibodies of the invention may be used in a
method of treatment, e.g. passive immunisation, of DEN1
infections. The antibodies may also be used in a method of
diagnosis, e.g. by immunoassay, to detect the presence or
absence of DEN1 in a sample. The antibodies may be
labelled as described above for the proteins and peptides
of the invention. They may also be labelled with a toxin
or isotope selected to kill virus-infected cells.
Antibodies against NS1 are particularly favoured since NS1
is expressed on the surface of Dengue virus-infected cells.

The antibodies of the invention may also be used in a
method to detect the presence or absence of DEN1 protein in
a sample. The method may comprise bringing the antibody
into contact with a sample suspected to contain DEN1
proteins (antigens) and detecting the amount of
antibody-antigen complex formed. Immunoassays according to the
invention may be, for example, competitive (e.g. radioimmune
assays - RIA) or non-competitive (e.g. enzyme linked
immunosorbent assays - ELISA).

The following Examples illustrate the invention. In
the accompanying drawings:

Figure 1 is a diagrammatic representation of the cDNA of
Dengue virus Type 1 (Singapore strain S275/90) and
fragments of said DNA in expression vectors;

Figure 2 shows gel results confirming serologic responses
in mice after immunisations with fusion proteins prepared
as in Examples 2-5 with or without complete Freund's
adjuvant (CFA).
Gel Lanes: Lane 1: M, Lane 2: anti-E, Lane 3: anti-E + CFA,
           Lane 4: anti-NS1, Lane 5: anti-NS2,
           Lane 6: anti-NS2 + CFA, Lane 7: anti-NS3,
           Lane 8: anti-NS3 + CFA, Lane 9: anti-NS5,
           Lane 10: anti-NS5 + CFA, Lane 11: positive
           rabbit sera, Lane 12: negative rabbit sera,
           Lane 13: M;

Figure 3 shows gel results confirming serologic response in
rabbits after immunisations with fusion proteins prepared
as in Examples 2 to 5. (-), serum before immunisation; (+),
serum after immunisation.
Gel Lanes: Lane 1: (-), Lane 2: (+) anti-E, Lane 3: (-),
           Lane 4: (+) anti-NS1, Lane 5: (-),
           Lane 6: (+) anti-NS2, Lane 7: (-),
           Lane 8: (+) anti-NS3, Lane 9: (-),
           Lane 10: (+) anti-NS5, Lane 11: positive Dengue,
           Lane 12: patient sera;

Figure 4 shows fluorescence microscopy of C6/36 cells
infected with Dengue Type 1 DI-275 and probed with
antibodies against recombinant fusion proteins. A, control
antiserum; B, anti-E; C, anti-NS1; D, anti-NS2; E, anti-NS3;
F, anti-NS5.
EXAMPLE 1

DEN1 virus, strain S275/90, was isolated in 1990 from
the serum of a DHF patient in Singapore by 3 passages in
AP61 (Aedes pseudoscutellaris) cells followed by 3 passages
in C6/36 (Aedes albopictus) cells, and identified by
immunofluorescence using type-specific monoclonal
antibodies. After a further 8 to 13 passages in C6/36
cells, the virus-infected culture fluid was partially
purified by precipitation with polyethylene glycol and
ultracentrifugation on a 30% sucrose cushion (6). The
viral RNA was extracted from the purified virus by
treatment with phenol in the presence of sodium dodecyl
sulphate. Following cDNA synthesis (cDNA Synthesis System
Plus, Amersham) using random primers, the assorted cDNAs
were cloned into EcoRI sites of pUC18 vector via EcoRI
adaptors (Promega). The Escherichia coli transformants
containing Dengue-specific sequences were screened by
colony hybridisation with 32P-labelled cDNA probes prepared
by reverse transcription of strain S275/90 RNA. The
cloning procedure yielded overlapping cDNA clones
containing inserts ranging in size from 0.5 kb to 2.7 kb.
The ends of these primary clones and their subclones
obtained by nested deletional analysis (Erase-a-Base
System, Promega) were subjected to double-strand sequencing
(Sequenase Version 2.0, United States Biochemical). The
sequence data generated covers about 90% of the genomic
sequence of S275/90.

Potential secondary structures have been postulated
for the 5' and 3' ends of flaviviruses (4, 7, 8), posing a
problem in obtaining clones with intact ends. A different
strategy for sequencing the 5' and 3' noncoding regions was
therefore used to increase the chances of obtaining clones
which contain these sequences as well as the terminal end
sequences of the genome. cDNAs of strain S275/90 were
obtained by random priming and oligo(dT) priming (after
poly(A) tailing of the virus RNA); these were amplified by
polymerase chain reaction (PCR) in the presence of
specific primers, 796 and 10090/B, respectively. The cDNAs
of interest were then religated into pUC18 vector. The
nucleotide sequences of the primers are as follows: primer
796, 5' CCG TGA ATC CTG GGT GTC 3'; primer 10090/B, 5' GGG
AAT TCC AGT GGT GTG GATC 3' with a BamHI site at its 5'
end. The sequences of the primers were selected from that
of the initial clones of strain S275/90. To obtain the
sequences at the 5' noncoding region, random cDNA clones
were first generated as described above, followed by
ligation to EcoRI adaptors before insertion into the EcoRI
sites of the pUC18 vector. These ligated products of
assorted cDNA inserts were flanked by the reverse and
forward sequencing primers of M13 in the pUC18 vector. The
forward sequencing primer was thus used as one of the
primers for PCR. The ligated cDNA clones were used as
templates for PCR in the presence of primer 796 (which
binds to the plus strand of the template at nucleotide
position 808 to 825 of strain S275/90) and the commercial
M13 single-strand primer (5' GTA AAA CGA CGG CCA GT 3',
Pharmacia). The amplified cDNAs thus contained the
polylinker from the pUC18 vector at one end and an XbaI
site (at nucleotide position 728) at the other end. For
the 3' noncoding region, an additional step was included
before cDNA synthesis. After extraction, the purified
Dengue viral RNA was tailed by poly(A) polymerase (Bethesda
Research Laboratories) with ATP. This was followed by cDNA
synthesis using oligo(dT) as primer for the first strand
cDNA synthesis. The same procedures of EcoRI adaptor
ligation and insertion into EcoRI sites of the pUC18 vector
were repeated. The ligated products were again subjected
to PCR amplification using the primer 10090/B (which binds
to the minus strand of the template at nucleotide positions
10,086 to 10,099 of strain S275/90) and the commercial M13
single-strand primer.
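
Because primer 796 is stated to bind the plus strand at nucleotides 808 to 825, the plus-strand sequence it pairs with would be the reverse complement of the primer. The following is a minimal sketch of that calculation, assuming perfect base pairing over the whole primer; the helper name `reverse_complement` is hypothetical and the result has not been checked against Seq. ID No. 1 here.

```python
# Illustrative sketch: derive the plus-strand sequence that a primer would
# pair with, assuming perfect Watson-Crick pairing along its full length.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (spaces ignored)."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# Primer 796 as given in the text, written 5' to 3'
PRIMER_796 = "CCG TGA ATC CTG GGT GTC"

if __name__ == "__main__":
    target = reverse_complement(PRIMER_796)
    # The primer is 18 nt long, matching the stated span 808-825 (18 positions)
    print(target, len(target))
```

The 18-nucleotide length of the primer is consistent with the stated binding span of positions 808 to 825.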

All samples were amplified by 30 cycles of PCR with
melting, annealing and polymerisation conditions of 1
minute at 94 °C, 2 minutes at 55 °C and 3 minutes at 72 °C,
respectively. The amplified DNA was purified by
electroelution in agarose gel followed by appropriate
restriction enzyme digestions. The PCR amplified cDNAs at
the 5' noncoding region were double digested with XbaI and
EcoRI, while those at the 3' noncoding region were digested
with BamHI and EcoRI before cloning into the appropriate
sites of the pUC18 vector. The clones were screened and
subjected to double-strand sequencing as described above.

The sequence data obtained from the overlapping cDNA
clones was ordered by homology alignment with the published
sequences of the four Dengue serotypes DEN1, DEN2, DEN3 and
DEN4 using the computer program of Wilbur and Lipman (9).

Seq. ID No. 1 shows the complete nucleotide sequence of
strain S275/90, which is 10,718 nucleotides in length, and
its deduced amino acid sequence. The reading frame begins
with the first AUG start codon, corresponding to
nucleotides 81 to 83, and contains an open reading frame of
10,188 nucleotides encoding a polyprotein of 3396 amino
acids; there are 80 nucleotides in the 5' noncoding region
and 450 nucleotides in the 3' noncoding region. The
sequence in the 5' noncoding region preceding the first AUG
codon of the open reading frame appears to be conserved for
all Dengue virus types (1-4). The 3' noncoding region of
strain S275/90 is longer than those of
DEN2 (412 nucleotides), DEN3 (433 nucleotides) and DEN4
(384 nucleotides).
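
The genome layout described above can be checked with a short sketch. It uses the first 110 nucleotides as printed in the sequence listing for Seq. ID No. 1 and the standard genetic code; the names `FIRST_110`, `CODON_TABLE` and `translate` are hypothetical and serve only this illustration.

```python
# Illustrative sketch: locate the first ATG in the start of Seq. ID No. 1,
# translate the first codons, and check the stated region lengths.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# First 110 nucleotides as printed in the sequence listing (spaces removed below)
FIRST_110 = (
    "GTGGACCGCA AAGAACAGTT TCGAATCGGA AGCTTGCTTA ACGTAGTTCT AACAGTTTTT "
    "TATTAGAGAG CAGATCTCTG ATG AAC AAC CAA CGA AAA AAG ACG GCT CGA"
).replace(" ", "")


def translate(cds: str) -> str:
    """Translate complete codons of a DNA coding sequence into one-letter codes."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Region lengths stated in the description
FIVE_PRIME_NCR = 80
ORF_LENGTH = 10188
THREE_PRIME_NCR = 450

if __name__ == "__main__":
    start = FIRST_110.find("ATG") + 1          # 1-based position of the first ATG
    print("first ATG at nucleotide", start)
    print("first residues:", translate(FIRST_110[start - 1:]))
    print("genome length:", FIVE_PRIME_NCR + ORF_LENGTH + THREE_PRIME_NCR)
    print("polyprotein length:", ORF_LENGTH // 3)
    print("last coding nucleotide:", FIVE_PRIME_NCR + ORF_LENGTH)
```

The three region lengths sum to the stated genome length, and the last coding nucleotide coincides with the end point given for NS5 in Table 1.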

The nucleotide composition of strain S275/90 is 31.9%
A, 25.9% G, 21.5% T and 20.7% C. As reported for the other
flaviviruses, the same purine-rich composition was
observed, and there is an absence of a poly(A) tract at the
3' end.
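
The purine-rich composition follows directly from the reported percentages. The sketch below sums them and also shows how such percentages could be computed from a sequence string; `base_composition` is a hypothetical helper, and the short test sequence is a toy input rather than viral sequence.

```python
# Illustrative sketch: base composition as percentages, plus the purine
# content implied by the figures reported for strain S275/90.
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, C, G and T in `seq`, rounded to 0.1."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: round(100 * counts[b] / total, 1) for b in "ACGT"}


# Percentages reported in the text
REPORTED = {"A": 31.9, "G": 25.9, "T": 21.5, "C": 20.7}

if __name__ == "__main__":
    purines = round(REPORTED["A"] + REPORTED["G"], 1)
    pyrimidines = round(REPORTED["T"] + REPORTED["C"], 1)
    print("purines (A+G):", purines, "pyrimidines (T+C):", pyrimidines)
    print(base_composition("AAGT"))  # toy input, not viral sequence
```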

The individual protein coding segments are based on
comparison with protein sequence data for all the proteins
determined from the four Dengue serotypes. These cleavage
sites may reveal the involvement of viral or cellular
proteases in protein processing. The C, preM, M,
E, NS1, NS2A, NS2B, NS3, NS4A, NS4B and NS5 proteins are
cleaved at the sites M/MNQRKK, A/FXL, RXKR/SV, X/MRCXG,
VQA/DXGCV, VXA/GXG, X/SWPLN, KXQR/XG, GRX/S, VXA/NE and
R/G, respectively, where X refers to any residue. The
cleavage sites of NS2A, NS3 and NS4B conform to the
reported consensus sequences (4, 5), which were originally
established by Rice et al (10).

The nucleotide sequences of the structural and
nonstructural regions (5' noncoding end to NS1, about 2400
nucleotides in length) of Nauru Island strain of DEN1
(isolated in 1974) and strain S275/90 were compared.
Nucleotide variation shows that transitions are about 85.0%
[transitions/(transitions + transversions) x 100%] in the
structural region and 92.1% in the NS1 region; 15% of these
base changes are transversions in the structural region and
7.9% in the NS1 region. The overall 236 nucleotide
differences have given rise to 27 amino acid substitutions.
As shown in Table 2, the nucleotide homology is 93.1% and,
when translated, the amino acid homology is 97.6%.
Although the two strains were isolated from different
geographic regions with an interval of 16 years, a high
homology was still observed between them. It
can also be seen in Table 2 that strain S275/90 shows a
higher homology with DEN3 than with DEN2 and DEN4. The
nucleotide divergence of each gene is less than the
translated amino acid divergence. The greatest nucleotide
and amino acid changes, and hence the greatest evolution,
lie in the nonstructural gene NS2A in all the four Dengue
serotypes. A high homology is found in NS3 and NS5, which
contain conserved sequences.
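
The transition percentage and the homology figures quoted above are simple counts over aligned sequences. The following minimal sketch shows one way to compute them for two pre-aligned, equal-length sequences without gaps; the function names and the toy sequences are hypothetical, and the alignment program actually used (Wilbur and Lipman) is not reproduced here.

```python
# Illustrative sketch: classify substitutions between two aligned sequences
# and report percent identity and the transition share of all differences.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a: str, b: str) -> str:
    """Classify an aligned base pair as match, transition or transversion."""
    if a == b:
        return "match"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"      # purine<->purine or pyrimidine<->pyrimidine
    return "transversion"        # purine<->pyrimidine


def compare(seq1: str, seq2: str) -> dict:
    """Count matches, transitions and transversions over an ungapped alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"match": 0, "transition": 0, "transversion": 0}
    for a, b in zip(seq1.upper(), seq2.upper()):
        counts[classify(a, b)] += 1
    differences = counts["transition"] + counts["transversion"]
    counts["identity_pct"] = 100 * counts["match"] / len(seq1)
    # transitions / (transitions + transversions) x 100%, as defined in the text
    counts["transition_pct"] = (
        100 * counts["transition"] / differences if differences else 0.0
    )
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration only
    print(compare("AAAACCCC", "GAAATCCA"))
```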

Chu et al (11) compared three topotypes of DEN1
strains (Thailand, Philippines and Caribbean) genetically
at the envelope region. They found nucleotide changes to
be less than 5% but translational differences of 2% at the
amino acid level. Our strain S275/90 shows nucleotide
changes of 7.7% and amino acid changes of 2.6% in the
envelope region. Rico-Hesse (6) compared nucleotide
sequences within a chosen E/NS1 region to estimate
evolutionary relationships among 40 DEN1 strains of
different geographic range and time period.

TABLE 2

HOMOLOGY (%) COMPARISON OF ALIGNED NUCLEOTIDE SEQUENCES OF THE
FOUR DENGUE SEROTYPES WITH STRAIN S275/90 (AMINO ACID ALIGNMENT
WITHIN BRACKETS)

S275/90          DEN1           DEN2           DEN3           DEN4

Full length      93.1 (97.6)    [illegible]    70.4 (75.5)    65.1 (67.6)
5' non-coding    100            81.7           93.8           87.7
C                97.4 (98.2)    70.5 (67.5)    80.5 (80.7)    68.1 (67.9)
PrM              91.6 (95.6)    71.1 (75.8)    75.8 (78.0)    68.0 (68.1)
M                93.3 (98.7)    64.0 (70.7)    70.3 (78.7)    60.7 (60.3)
E                92.3 (97.4)    65.4 (67.7)    69.0 (76.4)    64.8 (61.8)
NS1              92.6 (98.0)    70.1 (73.6)    74.5 (78.7)    70.1 (68.8)
NS2A             -              55.1 (39.0)    57.0 (46.8)    51.7 (37.9)
NS2B             -              66.0 (60.8)    69.4 (69.2)    63.1 (60.8)
NS3              -              72.0 (79.3)    74.0 (84.5)    69.9 (75.4)
NS4A             -              63.0 (61.4)    69.6 (68.7)    62.8 (58.7)
NS4B             -              69.7 (76.7)    74.9 (82.3)    71.1 (75.9)
NS5              -              71.7 (78.7)    73.8 (81.0)    69.9 (72.9)
3' non-coding    -              83.8           87.4           79.5
EXAMPLE 2

CONSTRUCTION OF EXPRESSION PLASMIDS

Standard recombinant DNA techniques were used for
construction of the expression plasmids described below and
summarised in Fig. 1 (Sambrook et al., Molecular Cloning: a
laboratory manual, Cold Spring Harbor Laboratory Press,
N.Y.).

For construction of plasmids, the cDNA regions for E,
NS1, NS2, NS3 and NS5 of clone DI-275, a DEN1 cDNA clone
derived from DEN1 virus Singapore Strain S275/90 as in
Example 1, were amplified by the polymerase chain reaction
(PCR) and digested with restriction enzymes. The
restriction enzyme sites were built into the
oligonucleotide primers used in the PCR as set out in Table
3 and Seq. ID Nos. 3-12.

Fragments of E, NS3 and NS5 cDNA digested with
restriction enzymes were ligated to the pGEX-KG vector
(Guan and Dixon, Anal. Biochem. 192, 262-267, 1991).
Fragments of NS1 and NS2 cDNA were ligated to pMAL-c and
pMAL-cRI vectors (New England Biolabs), respectively (Ford
et al., Prot. Exp. Pur. 2, 96-107, 1991; Maina et al., Gene
74, 365-373, 1988; di Guan et al., Gene, 67, 21-30, 1991).
The construction of NS5 cDNA was done in two stages. The
5'-region, the cDNA fragment from nucleotide 7544-8365 of
NS5, was made by PCR and digested with SalI and ClaI; the
3'-region, the fragment from nucleotide 8275 (ClaI) to the
3'-end of NS5, was isolated directly from the cDNA of clone
DI-275 (D-275 cDNA) by ClaI and SacI double digestion. The
two parts of NS5 were ligated together, then ligated into
the pGEX-KG vector. Recombinant plasmids were transformed
into E. coli DH5α or C600 HF1 strains. All plasmids
encoded Dengue virus proteins fused to the C-terminus of
glutathione S-transferase or Maltose Binding Protein (MBP).
EXAMPLE 3

PURIFICATION OF E, NS3 AND NS5 PROTEINS FROM RECOMBINANT
E. COLI

E. coli harbouring E, NS3 and NS5 genes (separately)
were grown in LB medium to an absorbance of 0.5 at 37 °C, then
induced with IPTG at 0.2 mM for 2 h at 30 °C. The bacteria were
harvested and resuspended on ice in MTPBS buffer (0.15 M
NaCl, 0.016 M Na2HPO4, 0.005 M NaH2PO4) with 0.1 mg/ml
lysozyme, 1% Triton X-100, 0.5 ng/ml aprotinin, 0.05 μg/ml
leupeptin, 0.25 μg/ml pepstatin, 5 mM DTT and 0.175 μg/ml
PMSF, and kept on ice for 10 min. The cells were sonicated

was centrifuged at 12,000 g. The supernatant was added
to 1 ml Glutathione-Sepharose 4B beads (Pharmacia), and
incubated at 4 °C on a rotator for 1 h to adsorb the fusion
proteins. Then the beads were centrifuged and washed with
PBS buffer (by centrifugation) at least 6 times, or until
the wash solution read zero at A280 in a spectrophotometer.
The beads were resuspended in thrombin cleavage buffer and
the Dengue virus proteins were cleaved off the beads with
thrombin at 4 °C for 1 h. The supernatant, containing
Dengue virus proteins, was recovered by centrifugation, and
the proteins were stored at -80 °C.

EXAMPLE 4

SOLUBILISATION AND PURIFICATION OF A FUSION PROTEIN OF NS1
FROM INCLUSION BODIES

E. coli containing the NS1 fusion protein was grown as
above, except the tac promoters were induced with 0.3 mM
IPTG for 16 h. The bacteria were harvested; 1 gram wet
weight of E. coli was resuspended in 5 ml lysis buffer with
lysozyme at 1.6 mg/ml and was sonicated for 2 x 15 sec.
After centrifugation at 1000 x g the supernatant was again
centrifuged (25,000 x g). The pellet was resuspended in 2
ml H2O, adding a final concentration of 0.5% Triton X-100,
10 mM EDTA, and 100 mM NaCl, then centrifuged at 20,000 x g
twice. The pellet was washed with 1 ml 2 M urea twice and
dissolved in 8 M urea in 0.1 M Tris-HCl pH 8.8, 0.14 M
2-mercaptoethanol. The urea concentration was reduced to 1 M
by adding H2O, and amylose resin (New England Biolabs) was
added to adsorb the solubilised fusion protein at 22 °C for
1 h. The amylose resin was washed with buffer (New England
Biolabs) five times until the A280 of the clarified
supernatant was near zero. A final concentration of 50 mM
maltose was then added to elute the fusion protein, which
was recovered by removing the beads by centrifugation.

EXAMPLE 5

PURIFICATION OF A SOLUBLE FUSION PROTEIN OF NS2

After growth of E. coli transformed with
pMAL-cRI/NS2-1, lysis and sonication as in Example 3 above,
the clarified extract containing the soluble NS2 fusion
protein was adsorbed onto amylose resin, followed by
washing and elution of the NS2 fusion protein as in Example
4 above.



EXAMPLE 6

IMMUNISATION OF RABBITS AND MICE

The soluble fusion proteins of E, NS2, NS3 and NS5
purified from recombinant E. coli, as in Examples 3 and 5
above, and inclusion bodies containing the NS1 fusion
protein which had been purified up to the 2 M urea wash
stage as in Example 4, were placed directly in SDS loading
buffer for preparative SDS-PAGE in 10% SDS-polyacrylamide
gels. The proteins were visualised by staining with 0.05%
Coomassie Blue for 10 min. The gel segments were cut and
homogenized in sterile PBS, mixed with Freund's adjuvant
and injected directly into white rabbits intramuscularly
and subcutaneously on the first, sixth and twenty-first
days with about 200-500 μg of fusion protein per injected
dose. The rabbits were bled 14 days after the last booster
dose. For immunisation of mice, 12-day old female Swiss
mice were immunised with the soluble proteins of E, NS1,
NS2, NS3 and NS5 fusion proteins with or without Freund's
adjuvant. The injections were intraperitoneal or
subcutaneous on the first, fourth, and fourteenth day,
using about 20 μg fusion protein per dose. The mice were
bled 14 days after the last dose. The sera of rabbits and
mice were used for IFA and immunoprecipitation assays.

EXAMPLE 7

RADIOIMMUNOPRECIPITATIONS

Radioimmunoprecipitations were done with rabbit and
mouse antibodies against the structural and non-structural
Dengue virus recombinant fusion proteins of D-275. At
36-40 h post-infection of C6/36 cells with Dengue virus
S275/90 strain, cell culture medium was replaced with
methionine-free medium containing 3 μg/ml actinomycin D for
3 h, followed by the addition of fresh medium with [35S]
methionine at 20 μCi/ml and 3 μg/ml actinomycin D for a
further 3 h. The cells were washed with cold PBS,
dissolved in RIPA buffer [100 mM Tris-HCl pH 7.5, 150 mM
NaCl, 10 mM EDTA, 0.1% SDS, 0.1% NP40, 1% sodium
deoxycholate, 100 μg/ml PMSF] on ice for 1 h, then
clarified at 1000 x g for 10 min. The lysates were
precleared with normal serum and protein A-Sepharose. For
immunoprecipitation, rabbit and mouse sera that had been
preabsorbed with normal, uninfected C6/36 cell extract
fixed by cold acetone were incubated with labelled antigen
overnight at 4 °C. The virus protein-antibody complexes
were precipitated with protein A-Sepharose and were washed
6 times with immunoprecipitation buffer [10 mM Tris-HCl,
pH 7.4, 0.05% aprotinin, 1% NP40, 2 mM EDTA, 0.15 M NaCl];
then 2X SDS-PAGE buffer was added, the sample was boiled
for 2 min, and the supernatant was loaded on a 12%
SDS-polyacrylamide gel. After fixing, enhancing and drying,
the gel was exposed to X-ray film. The results confirmed
that antibodies to
recombinant E, NS1, NS2, NS3 and NS5 had been generated in
mice (Fig. 2) and in rabbits (Fig. 3). These antibodies
reacted with the native E, NS1, NS2, NS3 and NS5 proteins
synthesised in infected C6/36 cells.

EXAMPLE 8

INDIRECT IMMUNOFLUORESCENCE ASSAY

The C6/36 cells infected with Dengue virus S275/90 for
2 days were fixed on glass plates with cold acetone for
immunofluorescence. 2-fold dilutions of the sera of
rabbits or mice were incubated with the fixed cells for 1 h
at 37 °C, then washed with PBS. Secondary antibodies were
linked to fluorescein and incubated for 1 h, followed by
washing with PBS for observation using fluorescence
microscopy. Fig. 4 shows that the antisera to E, NS1, NS2,
NS3 and NS5 reacted specifically with the Dengue virus
S275/90 infected cells, but control antiserum did not react.
Quantitation of the result (as set out in Table 4) showed
that an immune response to all recombinant Dengue virus
proteins (E, NS1, NS2, NS3 and NS5) occurred in both mice
and rabbits.
TABLE 3

Oligonucleotides used to prepare cDNA fragments
corresponding to Dengue virus proteins (by PCR)

1. pGEX-KG/EX-20
   DIF920E        EcoRI      E
       5' CCA TGA ATT CCC ATG CGA TGC GTG GGA
   DIF2400X       XhoI       E
       5' CAC ATC TCG AGT CCG CTT GAA CCA TGA

2. pMAL-c/NS1-104
   DIR2400S       SmaI       NS1
       5' TGG TTC CCG GGG ACT CGG GAT GTG TA
   DIF3458H       HindIII    NS1
       5' [sequence illegible here; the primers are listed in Seq. ID Nos. 3-12]

3. pMAL-cRI/NS2-1
   DIR-NS2PM      EcoRI      NS2
       5' AAT CAG AAT TCT CTG CA[?] GGT CAG GGG AA
   DIF-NS2H       HindIII    NS2
       5' ATA ACA AAG CTT A[?]C TTT GTT TCT TTT TCT

4. pGEX-KG/NS3 BHC6001
   DIR-NS3B       BamHI      NS3
       5' GAA AGG ATC CTC TGG AGT GTT ATG GGA CAC A
   DIF-6360H      HindIII    NS3
       5' ACC CAA GCT TCA TCT TCT TCC TGC TGC

5. pGEX-KG/NS5 (C600 HF1)
   DIR-7544S      SalI       NS5
       5' AGG AGG TCG ACG AGG TAG GGG AGC C
   DIF-8365
       5' CAA TGA TAT CTA GGT TGG CT
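
Since the restriction sites were built into the PCR primers, a simple string search can confirm that a primer carries the site named next to it. The sketch below does this for three primers whose printed sequences are unambiguous; the recognition sequences are the standard ones for these enzymes, while the dictionary and function names are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch: scan primer sequences for restriction enzyme
# recognition sites by plain substring search (top strand only).
SITES = {
    "EcoRI": "GAATTC",
    "XhoI": "CTCGAG",
    "SmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
}

# Primers copied from Table 3 (only those printed without ambiguity)
PRIMERS = {
    "DIR2400S": "TGG TTC CCG GGG ACT CGG GAT GTG TA",
    "DIR-NS3B": "GAA AGG ATC CTC TGG AGT GTT ATG GGA CAC A",
    "DIF-6360H": "ACC CAA GCT TCA TCT TCT TCC TGC TGC",
}


def find_sites(primer: str) -> list:
    """Return (enzyme, 0-based offset) for every recognition site in `primer`."""
    seq = primer.replace(" ", "").upper()
    hits = []
    for enzyme, site in SITES.items():
        offset = seq.find(site)
        while offset != -1:
            hits.append((enzyme, offset))
            offset = seq.find(site, offset + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

All of these recognition sequences are palindromic, so searching one strand is sufficient for this purpose.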
TABLE 4

IMMUNE RESPONSES OF MICE AND RABBITS; INDIRECT
IMMUNOFLUORESCENCE ASSAYS

Dengue virus type 1        No. of mice      Titrations of IFA
recombinant proteins

E                          11               14.91
E + CFA                    [illegible]      39.52
NS1                        10               14.89
NS2                        10               12.05
NS2 + CFA                  10               12.07
NS3                        11               10.94
NS3 + CFA                  10               42.56
NS5                        10               7.94
NS5 + CFA                  10               10.47
E + NS1                    17               16.66
NS3 + NS1                  18               10.87
NS2 + NS3                  14               9.23
NS5 + NS3                  10               32.14
MBP                        4                < 4
GST                        4                < 4
PBS                        2                < 4

Dengue virus type 1        No. of rabbits   Titrations of IFA
recombinant proteins

E                          1                160
NS1                        1                160
NS2 (67)                   1                2560
NS2 (68)                   1                640
NS3                        1                2560
NS5                        1                160
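The sequence listing that follows gives SEQ ID NO:1 as 10718 bases with a coding region (CDS) annotated at positions 81..10268, and prints each codon above its three-letter amino acid. As a rough sketch, the Python below checks that this span is a whole number of codons and translates the first ten codons of the listing. The dictionary holds only the standard-genetic-code entries needed for those ten codons, and all variable names are illustrative.

```python
# Figures taken from the sequence listing header for SEQ ID NO:1.
SEQ_LENGTH = 10718
CDS_START, CDS_END = 81, 10268  # 1-based, inclusive

# Length of the annotated coding region and its codon count.
cds_length = CDS_END - CDS_START + 1
codon_count, remainder = divmod(cds_length, 3)

# Bases lying before and after the annotated coding region.
bases_before = CDS_START - 1
bases_after = SEQ_LENGTH - CDS_END

# Subset of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "ATG": "Met", "AAC": "Asn", "CAA": "Gln", "CGA": "Arg",
    "AAA": "Lys", "AAG": "Lys", "ACG": "Thr", "GCT": "Ala",
}

# First ten codons of the CDS, as printed in the listing.
first_codons = "ATG AAC AAC CAA CGA AAA AAG ACG GCT CGA".split()
translated = [CODON_TABLE[codon] for codon in first_codons]

# Position of the last base of these ten codons; the listing prints 110
# at the end of the corresponding line.
last_base = CDS_START + 3 * len(first_codons) - 1

print(cds_length, codon_count, remainder)
print(bases_before, bases_after)
print(" ".join(translated), last_base)
```

The same arithmetic can be repeated line by line: each full line of the listing carries 16 codons, so the running position at the right margin advances by 48.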
SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (i) APPLICANT:
        (A) NAME: National University of Singapore
        (B) STREET: 10 Kent Ridge Crescent
        (C) CITY: Singapore
        (E) COUNTRY: Singapore
        (F) POSTAL CODE (ZIP): 0511

    (ii) TITLE OF INVENTION: Dengue Virus

    (iii) NUMBER OF SEQUENCES: 12

    (iv) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

    (v) CURRENT APPLICATION DATA:
        APPLICATION NUMBER:

(2) INFORMATION FOR SEQ ID NO:1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10718 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: RNA (genomic)

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Dengue Fever Virus Type 1
        (B) STRAIN: S275/90

    (ix) FEATURE:
        (A) NAME/KEY: CDS
        (B) LOCATION: 81..10268

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GTGGACCGCA AAGAACAGTT TCGAATCGGA AGCTTGCTTA ACGTAGTTCT AACAGTTTTT 60

TATTAGAGAG CAGATCTCTG ATG AAC AAC CAA CGA AAA AAG ACG GCT CGA 110

Met Asn Asn Gln Arg Lys Lys Thr Ala Arg
1 5 10

CCG TCT TTC AAT ATG CTG AAA CGC GCG AGA AAC CGC GTG TCA ACT GGT 158
Pro Ser Phe Asn Met Leu Lys Arg Ala Arg Asn Arg Val Ser Thr Gly
15 20 25

TCA CAG TTG GCG AAG AGA TTC TCA AAA GGA TTG CTT TCA GGC CAA GGA 206
Ser Gln Leu Ala Lys Arg Phe Ser Lys Gly Leu Leu Ser Gly Gln Gly
30 35 40

CCC ATG AAA TTG GTG ATG GCT TTC ATA GCA TTC CTA AGA TTT CTA GCC 254
Pro Met Lys Leu Val Met Ala Phe Ile Ala Phe Leu Arg Phe Leu Ala
45 50 55

ATA CCC CCA ACA GCA GGA ATT TTG GCT AGA TGG GGC TCA TTC AAG AAG 302
Ile Pro Pro Thr Ala Gly Ile Leu Ala Arg Trp Gly Ser Phe Lys Lys
60 65 70

AAT GGA GCG ATC AAA GTG CTA CGG GGT TTC AAG AAA GAA ATC TCA AAC 350
Asn Gly Ala Ile Lys Val Leu Arg Gly Phe Lys Lys Glu Ile Ser Asn
75 80 85 90

ATG TTG AAC ATA ATG AAT AGA AGG AAA AGA TCT GTG ACC ATG CTC CTC 398
Met Leu Asn Ile Met Asn Arg Arg Lys Arg Ser Val Thr Met Leu Leu
95 100 105

ATG CTG CTG CCC ACA GCC TTG GCG TTC CAT TTG ACT ACA CGA GGG GGA 446
Met Leu Leu Pro Thr Ala Leu Ala Phe His Leu Thr Thr Arg Gly Gly
110 115 120

GAG CCA CAC ATG ATA GTT AGC AAG CAG GAA AGA GAA AAG TCA CTC TTG 494
Glu Pro His Met Ile Val Ser Lys Gln Glu Arg Glu Lys Ser Leu Leu
125 130 135

TTT AAG ACC TCT GTA GGT GTC AAC ATG TGC ACC CTT ATA GCG ATG GAT 542
Phe Lys Thr Ser Val Gly Val Asn Met Cys Thr Leu Ile Ala Met Asp
140 145 150

TTG GGA GAG TTA TGT GAG GAC ACA ATG ACT TAC W TGC CCT CGA ATT 590
Leu Gly Glu Leu Cys Glu Asp Thr Met Thr Tyr Lys Cys Pro Arg Ile
155 160 165 170

ACT GAG GCG GAA CCA GAT GAC GTT GAT TGT TGG TGC AAT GCT ACA GAC 638
Thr Glu Ala Glu Pro Asp Asp Val Asp Cys Trp Cys Asn Ala Thr Asp
175 180 185

ACA TGG GTG ACC TAT GGA ACA TGT TCC CAA ACT GGC GAG CAC CGA CGG 686
Thr Trp Val Thr Tyr Gly Thr Cys Ser Gln Thr Gly Glu His Arg Arg
190 195 200

GAC AAA CGT TCC GTC GCA CTG GCC CCA CAC GTG GGA CTT GGT CTA GAA 734
Asp Lys Arg Ser Val Ala Leu Ala Pro His Val Gly Leu Gly Leu Glu
205 210 215

ACA AGA ACC GAA ACG TGG ATG TCC TCT GAA GGC GCT TGG AAA CAA ATA 782
Thr Arg Thr Glu Thr Trp Met Ser Ser Glu Gly Ala Trp Lys Gln Ile
220 225 230

CAA AGA GTG GAG ACT TGG GCT TTG CGA CAC CCA GGA TTC ACG GTG ATA 830
Gln Arg Val Glu Thr Trp Ala Leu Arg His Pro Gly Phe Thr Val Ile
235 240 245 250

GCC CTT TTT CTT GCA CAT GCC ATA GGA ACA TCC ATC ACT CAG AAA GGG 878
Ala Leu Phe Leu Ala His Ala Ile Gly Thr Ser Ile Thr Gln Lys Gly
255 260 265

ATT ATT TTC ATT TTG TTA ATG CTA GTA ACA CCA TCC ATG GCC ATG CGA 926
Ile Ile Phe Ile Leu Leu Met Leu Val Thr Pro Ser Met Ala Met Arg
270 275 280

TGC GTG GGA ATA GGC AGC AGG GAC TTC GTG GAA GGA CTA TCA GGA GCA 974
Cys Val Gly Ile Gly Ser Arg Asp Phe Val Glu Gly Leu Ser Gly Ala
285 290 295
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ACT TGG GTA GAC GTG GTA CTG GAA CAT GGA AGT TGC GTC ACC ACC ATG 1022 
Thr Trp Val Asp Val Val Leu Glu His Gly Ser Cys Val Thr Thr Met 
300 305 310 

GCA AAA GAC AAA CCA ACA TTG GAC ATT GAA CTC CTG AAA ACG GAG GTC 1070 
Ala Lys Asp Lys Pro Thr Leu Asp He Glu Leu Leu Lys Thr Glu Val 
315 " 320 325 330 

ACG AAC CCT GCC GTC CTG CGC AAA CTG TGC ATT GAA GCT AAA ATA TCA 1118 
Thr Asn Pro Ala Val Leu Arg Lys Leu Cys He Glu Ala Lys lie Ser 
335 • 340 345 

AAC ACC ACC ACC GAT TCA AGA TGT CCA ACA GAA GGA GAA GCT ACA CTG 1166 
Asn Thr Thr Thr Asp Ser Arg Cys Pro Thr Gin Gly Glu Ala Thr L6u 
350 355 360 

GTG GAA GAA CAA GAC GCG AAC TTT GTG TGT CGA CGA ACG TTC GTG GAC 1214 
Val Glu Glu Gin Asp Ala Asn Phe Val Cys Arg Arg Thr Phe Val Asp 
365 370 375 

AGA GGC TGG GGT AAT GGC TGC GGA CTA TTT GGA AAA GGA AGC CTA CTG 1262 
Arg Gly Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys Gly Ser Leu Leu 
380 385 390 

ACG TGT GCT AAG TTC AAG TGT GTG ACA AAA CTA GAA GGA AAG ATA GTT 1310 
Thr Cys Ala Lys Phe Lys Cys Val Thr Lys Leu Glu Gly Lys lie Val 
395 400 405 410 

CAA TAT GAA AAC TTA AAA TAT TCA GTG ATA GTC ACT GTC CAC ACT GGG 1358 
Gin Tyr Glu Asn Leu Lys Tyr Ser Val lie Val Thr Val His Thr Gly 
415 420 425 

GAC CAG CAC CAG GTG GGA AAC GAG ACT ACA GAA CAT GGA ACA ATT GCA 1406 
Asp Gin His Gin Val Gly Asn Glu Thr Thr Glu His Gly Thr He Ala 
430 435 440 

ACC ATA ACA CCT CAA GCT CCT ACG TCG GAA ATA CAG CTG ACC GAC TAC 1454 
Thr He Thr Pro Gin Ala Pro Thr Ser Glu He Gin Leu Thr Asp Tyr 
445 450 455 

GGA GCC CTC ACA TTG GAC TGC TCA CCT AGA ACT GGG CTG GAC TTT AAT 1502 
Gly Ala Leu Thr Leu Asp Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn 
460 465 470 

GAG ATG GTG CTA TTG ACA ATG AAA GAA AAA TCA TGG CTT GTT CAC AAA 1550 
Glu Met Val Leu Leu Thr Met Lys Glu Lys Ser Trp Leu Val His Lys 
475 480 485 490 

CAA TGG TTT CTA GAC TTA CCA CTG CCT TGG ACT TCG GGG GCT TCA ACA 1598 
Gin Trp Phe Leu Asp Leu Pro Leu Pro. Trp Thr Ser Gly Ala Ser Thr 
495 500 505 

TCC CAA GAG ACT TGG AAC AGA CAA GAT TTG CTG GTC ACA TTC AAG ACA 1646 
Ser Gin Glu Thr Trp Asn Arg Gin Asp Leu Leu Val Thr Phe Lys Thr 
510 515 520 

GCT CAT GCA AAG AAG CAG GAA GTA GTC GTA CTG GGA TCA CAG GAA GGA 1694 
Ala His Ala Lys Lys Gin Glu Val Val Val Leu Gly Ser Gin Glu Gly 
525 530 535 

GCA ATG CAC ACT GCG TTG ACT GGG GCG ACA GAA ATC CAA ACG TCT GGA 1742 
Ala Met His Thr Ala Leu Thr Gly Ala Thr Glu He Gin Thr Ser Gly 
540 545 550 
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ACG ACA ACA ATT TTT GCA GGA CAC CTG AAA. TGT AGA CTA AAA ATG GAC 1790 
Thr Thr Thr lie Phe Ala Gly His Leu Lys Cys Arg Leu Lys Met Asp 
555 560 565 570 

AAA CTG ACT CTA AAA GGG ATG TCA TAT GTG ATG TGC ACA GGC TCA TTT 1838 
Lys Leu Thr Leu Lys Gly Met Ser Tyr Val Met Cys Thr Gly Ser Phe 
575 580 585 

AAG CTA GAG AAG GAA GTG GCT GAG ACC CAG CAT GGA ACT GTT TTA GTG 1886 
Lys Leu Glu Lys Glu Val Ala Glu Thr Gin His Gly Thr Val Leu Val 
590 595 * 600 

CAG GTT AAA TAC GAA GGA ACA GAT GCA CCA TGC AAG ATC CCC TTT TCG 1934 
Gin Val Lys Tyr Glu Gly Thr Asp Ala Pro Cys Lys lie Pro Phe Ser 
605 610 615 

ACC CAA GAT GAG AAA GGA GTG ACC CAG AAT AGA TTG ATA ACA GCC AAT 1982 
Thr Gin Asp Glu Lys Gly Val Thr Gin Asn Arg Leu lie Thr Ala Asn 
620 625 630 

CCT ATA GTT ACT GAC AAA GAA AAA CCA GTC AAC ATT GAG ACA GAA CCA 2030 
Pro He Val Thr Asp Lys Glu Lys Pro Val Asn He Glu Thr Glu Pro 
635 640 645 650 



Pro Phe Gly Glu Ser Tyr He Val Val Gly Ala Gly Glu Lys Ala Leu 
655 660 665 

AAA CAA TGC TGG TTC AAG AAA GGA AGC AGC ATA GGG AAA ATG TTC GAA 2126 

Lys Gin Cys Trp Phe Lys Lys Gly Ser Ser He Gly Lys Met Phe Glu 
670 675 680 

GCA ACC GCC CGA GGA GCA CGA AGG ATG GCT ATC CTG GGA GAC ACC GCA 2174 

Ala Thr Ala Arg Gly Ala Arg Arg Met Ala He Leu Gly Asp Thr Ala 
685 690 695 

TGG GAC TTC GGT TCT ATA GGA GGA GTG TTC ACG TCT GTG GGA AAA TTA 2222 

Trp Asp Phe Gly Ser He Gly Gly Val Phe Thr Ser Val Gly Lys Leu 
700 705 710 

GTG CAT CAG GTT TTT GGA ACC GCA TAT GGG GTT CTG TTC AGC GGT GTT 2270 

Val His Gin Val Phe Gly Thr Ala Tyr Gly Val Leu Phe Ser Gly Val 

715 720 725 730 

TCT TGG ACC ATG AAA ATA GGA ATA GGG ATT CTG CTG ACA TGG TTG GGA 2318 

Ser Trp Thr Met Lys He Gly He Gly He Leu Leu Thr Trp Leu Gly 
735 740 745 

TTA AAT TCA AGG AGC ACG TCA CTT TCG ATG ACG TGC ATT GCA GTT GGC 2366 

Leu Asn Ser Arg Ser Thr Ser Leu Ser Met Thr Cys He Ala Val Gly 
750 755 760 

ATG GTC ACA CTG TAC CTA GGA GTC ATG GTT CAA GCG GAC TCG GGA TGT 2414 

Met Val Thr Leu Tyr Leu Gly Val Met Val Gin Ala Asp Ser Gly Cys 
765 770 775 

GTA ATC AAC TGG AAG GGC AGA GAA CTC AAA TGT GGA AGT GGC ATT TTT 2462 

Val He Asn Trp Lys Gly Arg Glu Leu Lys Cys Gly Ser Gly He Phe 
780 785 790 

GTC ACT AAT GAA GTC CAC ACT TGG ACA GAG CAA TAC AAA TTT CAA GCT 2510 

Val Thr Asn Glu Val His Thr Trp Thr Glu Gin Tyr Lys Phe Gin Ala 

795 800 805 * 810 
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GAC TCC CCA AAA AGA CTA TCA GCA GCC ATC GGA AAG GCA TGG GAG GAG 2558 
Asp Ser Pro Lys Arg Leu Ser Ala Ala He Gly Lys Ala Trp Glu Glu 
SIS 820 825 

GGT GTG TGT GGA ATT CGA TCA GCC ACT CGT CTC GAG. AAC ATC ATG TGG 2606 
Gly Val Cys Gly He Arg Ser Ala Thr Arg Leu Glu Asn He Met Trp 
830 835 840 

AAG CAA ATA TCA AAT GAA CTG AAC CAC ATC TTA CTT GAA AAT GAC ATG 2654 
Lys Gln Ile Ser Asn Glu Leu Asn His Ile Leu Leu Glu Asn Asp Met
845 850 855

AAA TTC ACA GTG GTT GTA GGA GAT GTT GTT GGG ATC TTG GCC CAA GGG 2702 
Lys Phe Thr Val Val Val Gly Asp Val Val Gly He Leu Ala Gin Gly 
860 865 870 

AAA AAA ATG ATT AGA CCA CAA CCC ATG GAA CAC AAA TAC TCA TGG AAA 2750 
Lys Lys Met He Arg Pro Gin Pro Met Glu His Lys Tyr Ser Trp Lys 
375 880 885 890 

AGC TGG GGA AAA GCC AAA ATC ATA GGA GCA GAC ATA CAG AAC ACC ACC 2798 
Ser Trp Gly Lys Ala Lys He He Gly Ala Asp He Gin Asn Thr Thr 
895 900 905 

TTC ATC ATT GAC GGC CCA GAT ACT CCA GAA TGT CCT GAT GAC CAA AGA 2846
Phe Ile Ile Asp Gly Pro Asp Thr Pro Glu Cys Pro Asp Asp Gln Arg
910 915 920

GCA TGG AAC ATT TGG GAA GTT GAG GAC TAT GGG TTC GGA ATT TTC ACG 2894 
Ala Trp Asn He Trp Glu Val Glu Asp Tyr Gly Phe Gly He Phe Thr 
925 930 935 

ACA AAC ATA TGG TTG AAA TTG CGT GAC TCC TAC ACC CAA ATG TGT GAC 2942 
Thr Asn He Trp Leu Lys Leu Arg Asp Ser Tyr Thr Gin Met Cys Asp 
940 945 950 

CAC CGG CTA ATG TCA GCT GCC ATC AAG GAC AGC AAG GCA GTC CAT GCT 2990 
His Arg Leu Met Ser Ala Ala lie Lys Asp Ser Lys Ala Val His Ala 
955 960 965 970 - 

GAT ATG GGG TAC TGG ATA GAA AGT GAA AAG AAC GAG ACC TGG AAG CTG 3038 
Asp Met Gly Tyr Trp He Glu Ser Glu Lys Agn Glu Thr Trp Lys Leu 
975 980 985 

GCA AGA GCC TCT TTC ATA GAA GTT AAA ACA TGT GTC TGG CCA AAA TCC 3086 
Ala Arg Ala Ser Phe He Glu* Val Lya Thr Cys Val Trp Pro Lys Ser 
990 995 ~ 1000 

CAC ACT CTA TGG AGC AAT GGA GTT CTG GAA AGT GAA ATG ATA- ATT CCA 3134 
His Thr Leu Trp Ser Asn Gly Val Leu Glu Ser Glu Met He He Pro 
1005 1010 1015 

AAG ATC TAT GGA GGA CCA ATA TCT CAG CAC AAC TAC AGA CCA GGA TAT 3182 
Lys He Tyr Gly Gly Pro lie Ser Gin His Asn Tyr Arg Pro Gly Tyr 
1020 1025 1030 

TTC ACA CAA ACG GCA GGG CCA TGG CAC CTA GGC AAG TTG GAA CTG GAT 3230 
Phe Thr Gin Thr Ala Gly Pro Trp His Leu Gly Lys Leu Glu Leu Asp 
1035 1Q40 " 1045 1050 

TTT GAT TTG TGT GAG GGT ACC ACA GTT GTT GTG GAT GAA CAT TGT GGA 3278 
Phe Asp Leu Cys Glu Gly Thr Thr Val Val Val Asp Glu His Cys Gly 
1055 1060 " 1065 

AAT CGA GGT CCA TCT CTT AGA ACC ACA ACA GTC ACA GGA AAG ATA ATT 3326
Asn Arg Gly Pro Ser Leu Arg Thr Thr Thr Val Thr Gly Lys Ile Ile
1070 1075 1080

CAT GAA TGG TGT TGC AGA TCT TGT ACG CTA CCA CCC TTA CGT TTC AAA 3374
His Glu Trp Cys Cys Arg Ser Cys Thr Leu Pro Pro Leu Arg Phe Lys
1085 1090 1095

GGA GAA GAT GGA TGT TGG TAC GGT ATG GAA ATC AGA CCA GTC AAG GAA 3422
Gly Glu Asp Gly Cys Trp Tyr Gly Met Glu Ile Arg Pro Val Lys Glu
1100 1105 1110

AAG GAA GAG AAT CTA GTC AAA TCA ATG GTC TCT GCA GGG TCA GGG GAA 3470
Lys Glu Glu Asn Leu Val Lys Ser Met Val Ser Ala Gly Ser Gly Glu
1115 1120 1125 1130

GTG GAC AGC TTT TCA CTA GGA CTG CTA TGC ATA TCA ATA ATG ATC GAA 3518
Val Asp Ser Phe Ser Leu Gly Leu Leu Cys Ile Ser Ile Met Ile Glu
1135 1140 1145

GAG GTG ATG AGA TCC AGA TGG AGC AGA AAA ATG CTG ATG ACT GGA ACA 3566
Glu Val Met Arg Ser Arg Trp Ser Arg Lys Met Leu Met Thr Gly Thr
1150 1155 1160

CTG GCx GTG TTC CTC CXX CXC ATA ATG GGA CAA TTG ACA TGG AAT GAT 3614
Leu Ala Val Phe Leu Leu Leu Ile Met Gly Gln Leu Thr Trp Asn Asp
1165 1170 1175

CTG ATC AGG TTA TGC ATC ATG GTT GGA GCC AAT GCT TCA GAC AGG ATG 3662
Leu Ile Arg Leu Cys Ile Met Val Gly Ala Asn Ala Ser Asp Arg Met
1180 1185 1190

GGG ATG GGA ACA ACG TAC CTA GCT CTG ATG GCC ACT TTT AAA ATG AGA 3710
Gly Met Gly Thr Thr Tyr Leu Ala Leu Met Ala Thr Phe Lys Met Arg
1195 1200 1205 1210

CCA ATG TTT GCT GTC GGG CTG TTG TTC CGC AGA CTA ACA TCT AGA GAA 3758
Pro Met Phe Ala Val Gly Leu Leu Phe Arg Arg Leu Thr Ser Arg Glu
1215 1220 1225

GTT CTT CTT CTT ACA ATT GGA TTG AGT CTA GTG GCA TCT GTG GAG TTA 3806
Val Leu Leu Leu Thr Ile Gly Leu Ser Leu Val Ala Ser Val Glu Leu
1230 1235 1240

CCA AAT TCC CTG GAG GAG CTG GGG GAT GGA CTT GCA ATG GGC ATT ATG 3854
Pro Asn Ser Leu Glu Glu Leu Gly Asp Gly Leu Ala Met Gly Ile Met
1245 1250 1255

ATT TTA AAA TTA TTG ACT GAC TTT CAG TCA CAT CAG CTG TGG GCT ACC 3902
Ile Leu Lys Leu Leu Thr Asp Phe Gln Ser His Gln Leu Trp Ala Thr
1260 1265 1270

TTG CTG TCC TTG ACA TTT GTC AAA ACA ACG TTT TCC TTG CAC TAT GCA 3950
Leu Leu Ser Leu Thr Phe Val Lys Thr Thr Phe Ser Leu His Tyr Ala
1275 1280 1285 1290

TGG AAG ACA ATG GCT ATG GTA CTG TCA ATT GTA TCT CTC TTC CCC TTA 3998
Trp Lys Thr Met Ala Met Val Leu Ser Ile Val Ser Leu Phe Pro Leu
1295 1300 1305

TGC CTG TCC ACG ACC TCC CAA AAA ACA ACA TGG CTT CCG GTG CTA TTG 4046
Cys Leu Ser Thr Thr Ser Gln Lys Thr Thr Trp Leu Pro Val Leu Leu
1310 1315 1320
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GGA TCT CTT GGATGC MA CCA CTA ACC ATG TTT. CTC ATA GCA GAA AAC 4094 
Gly Ser Leu Gly Cys Lys Pro Leu Thr Met Phe Leu lie Ala Glu Asn 
1325 1330 1335 

AAA ATC TGG GGA AGG AAA AGT TGG CCC CTC AAT GAA GGA ATC ATG GCT 4142 
Lys He Trp Gly Arg Lys Ser Trp Pro Leu Asit Glu Gly He Met Ala 
1340 1345 135G 

GTT GGA ATA GTC AGC ATC CTA CTA AGT TCA CTC CTC AAA AAT GAT GTG 4190 
Val Gly He Val Ser He Leu Leu Ser Ser Leu Leu Lys Asn Asp Val 
1355 1360 1365 * 1370. 

CCG CTA GCT GGG CCA CTA ATA GCT GGA GGC ATG CTA ATA GCA TGT TAC 4238 
Pro Leu Ala Gly Pro Leu He Ala Gly Gly Met Leu He Ala Cys Tyr 
1375 . . 1380 1385 

GTT ATA TCT GGA AGC TCA GCC GAC TTA TCA CTA GAG AAA. GCG GCT GAG 4286 
Val He Ser Gly Ser Ser Ala Asp Leu Ser Leu Glu Lys Ala Ala Glu 
1390 1395 1400 

GTC TCC TGG GAA GAA GAA GCA GAA CAC TCT GGT GCC TCA CAC AAT? ATA 4334 
Val Ser Trp Glu Glu Glu Ala Glu His Ser Gly Ala Ser His Asn He 
1405 1410 1415 

TTA GTG GAG GTC CAA GAT GAT GGA ACC ATG AAG ATA AAA GAT GAA GAG 4382 
Leu Val Glu Val Gin Asp Asp Gly Thr Met Lys He Lys Asp Glu Glu 
1420 1425 1430 

AGA GAT GAC ACG CTA ACC ATT CTC CTT AAA GCA ACC CTG CTA GCA GTT 4430 
Arg Asp Asp Thr Leu Thr He Leu Leu Lys Ala Thr Leu Leu Ala Val 
.1435 144G . 1445 1450 

TCA GGG GTG TAC CCA TTA TCA ATA CCA GCA ACC CTT TTT GTG TGG TAC 4478 
Ser Gly Val Tyr Pro Leu Ser He Pro Ala Thr Leu Phe Val Trp Tyr 
1455 1460 1465 

TTT TGG CAG AAA AAG AAA CAA AGA TCT GGA GTG TTA TGG GAC ACA CCT . 4526 

Phe Trp Gin Lys Lys Lys Gin Arg Ser Gly Val Leu Trp Asp Thr Pro 
1470 1475 1480 

AGC CCT CCA GAA GTG GAA AGA GCA GTC CTT GAT GAT GGT ATC TAT AGA 4574 
Ser Pro Pro Glu Val Glu Arg Ala Val Leu Asp Asp Gly lie Tyr Arg 
1485 1490 1495 

ATT ATG CAG AGA GGA CTG TTG GGC AGG TCC CAA GTA GGA GTG GGA GTT 4622 
He Met Gin Arg Gly Leu Leu Gly Arg Ser Gin Val Gly Val Gly Val 
1500 1505.- 1510 

TTC CAA GAC GGC GTG TTC CAC ACA ATG TGG CAC GTC ACC AGG GGA GCT 4670 
Phe Gin Asp Gly Val Phe His Thr Met Trp His Val Thr Arg Gly Ala 
1515 1520 1525 1530 

GTC CTT ATG TAC CAA GGG AAG AGG CTG GAA CCA AGC TGG GCC AGT GTC 4718 
Val Leu Met Tyr Gin Gly Lys Arg Leu Glu Pro Ser Trp Ala Ser Val 
1535 1540 1545 

AAA AAA GAC TTG ATC TCA TAT GGA GGA GGT TGG AGG TTT CAA GGA TCC 4766 
Lys Lys Asp Leu He Ser Tyr Gly Gly Gly Trp Arg Phe Gin Gly Ser 
1550 1555 1560 

TGG AAC ACG GGA GAA GAA GTG CAG GTG ATT GCT GTT GAA CCA GGA AAA 4814 
Trp Asn Thr Gly Glu Glu Val Gin Val He Ala Val Glu Pro Gly Lys 
1565 1570 1575 
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AAC CCC AAA AAT GTA CAG ACA GCG CCG GGT ACC TTC AAG ACC CCT GAA 4862 
Asn Pro Lya Asn Val Gin Thr Ala Pro Gly Thr Phe Lys Thr Pro Glu 
1580 1585 1590 

GGT GAA GTT GGA GCT- ATT GCC CTA GAT TTT AAA CCC GGC ACA TCT GGA 4910 
Gly Glu Val Gly Ala lie Ala Leu Asp Phe Lys Pro Gly Thr Ser Gly 
159s . 1600 1605 * 1610 

TCT CCC ATC GTG AAC AGA GAA GGA AAA ATA GTA GGT CTT TAT GGA AAT 4958 
Ser Pro He Val Asn Arg Glu Gly Lys He Val Gly Leu Tyr Gly Asn 
1615 1620 1625 

GGA GTA GTG ACA ACA AGT GGA ACC TAG GTC AGT GCC ATA GCC CAA GCC 5006 
Gly Val Val Thr Thr Ser Gly Thr Tyr Val Ser Ala lie Ala Gin Ala 
1630 1635 1640 

AAA GCA TCA CAA GAA GGG CCC CTA CCA GAG ATT GAG GAC GAG GTG TTT 5054 
Lys Ala Ser Gin Glu Gly Pro Leu Pro Glu He Glu Asp Glu Val Phe 
1645 1650 1655 

AGG AAA AGA AAC TTA ACA ATA ATG GAC CTA CAT CCA GGA TCG GGG AAA 5102 
Arg Lys Arg Asn Leu Thr He Met Asp Leu His Pro Gly Ser Gly Lvs 
1660 1665 " 1670 

ACA AGA AGA TAT CTT CCA GCC ATA GTC CGT GAG GCC ATA AGA AGG AAC 5150 
Thr Arg Arg Tyr Leu Pro Ala He Val Arg Glu Ala He Arg Arg Asn 
1675 1680 1685 1690 

GTG CGC ACA CTA ATT TTG GCT CCC ACA AGG GTT GTC GCT TCC GAA ATG 5198
Val Arg Thr Leu Ile Leu Ala Pro Thr Arg Val Val Ala Ser Glu Met
1695 1700 1705

GCA GAG GCG CTC AAG GGA ATG CCA ATA AGG TAC CAA ACA ACA GCA GTG 5246 
Ala Glu Ala Leu Lys Gly Met Pro He Arg Tyr Gin Thr Thr Ala Val 
1710 1715 1720 

AAG AGT GAA CAC ACA GGA AAA GAG ATA GTT GAC CTC ATG TGT CAC GCC 5294 
Lys Ser Glu His Thr Gly Lys Glu He Val Asp Leu Met Cys His Ala 
1725 1730 1735 

ACT TTC ACC ATG CGT CTC CTG TCT CCC GTG AGA GTT CCC AAT TAC AAC 5342 
Thr Phe Thr Met Arg Leu Leu Ser Pro Val Arg Val Pro Asn Tyr Asn 
1740 1745 1750 

ATG ATT ATC ATG GAT GAA GCA CAT TTT ACC GAT CCA GCC AGC ATA GCG 5390
Met Ile Ile Met Asp Glu Ala His Phe Thr Asp Pro Ala Ser Ile Ala
1755 1760 1765 1770

CGC AGA GGG TAC ATC TCA ACC CGA GTG GGC ATG GGT GAA GCA GCT GCG 5438 
Arg Arg Gly Tyr He Ser Thr Arg Val Gly Met Gly Glu Ala Ala Ala 
1775 1780 1785 

ATC TTC ATG ACA GCC ACT CCC CCA GGA TCG GTG GAG GCC TTT CCA CAG 5486 
He Phe Met Thr Ala Thr Pro Pro Gly Ser Val Glu Ala Phe Pro Gin 
1790 1795 1800 

AGC AAT GCA GTT ATC CAA GAT GAG GAA AGA GAC ATT CCT GAG AGA TCA 5534 
Ser Asn Ala Val He Gin Asp Glu Glu Arg Asp He Pro Glu Arg Ser 
1805 1810 1815 

TGG AAC TCA GGC TAT GAG TGG ATC ACT GAC TTC CCA GGT AAA ACA GTC 5582
Trp Asn Ser Gly Tyr Glu Trp Ile Thr Asp Phe Pro Gly Lys Thr Val
1820 1825 1830
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TGG TTT GTT CCA AGC ATC AAA TCA GGA, AAT- GAC ATT GCC AAC TGC TTA 5630 
Trp Phe Val Pro Ser He Lys Ser Gly Asn Asp He Ala Asn Cys Leu 
1835. 1840 1845 1850 

AGA AAG AAT GGG AAA CGG GTG ATT CAA TTG AGC AGG AAA ACC TTT GAT 5678 
Arg Lys Asn Gly Lys Arg Val lie Gin Leu Ser Arg Lys Thr Phe Asp 
1855 . 1860 1865 

ACA GAG TAC CAA AAA ACA AAA AAC AAC GAC TGG GAC TAT GTC GTC ACA 5726 
Thr Glu Tyr Gin £ys Thr Lys Asn Asn Asp Trp Asp Tyr Val Val Thr 
1870 1875 1880 

ACA GAT ATC TCC GAA ATG GGA GCA AAC TTC CGA GCC GAC AGG GTG ATA 5774 
thr Asp He Ser Glu Met Gly Ala Asn Phe Arg Ala Asp Arg Val He 
1885 1890 1895 

GAC CCA AGA CGG TG? CTG AAA CCG GTA ATA CTA AAA GAT GGT CCA GAG 5822 
Asp Pro Arg Arg Cys Leu Lys Pro Val He Leu Lys Asp Gly Pro Glu 
1900 ~ 1905 1910 

CGC GTC ATT CTA GCC GGA CCG ATG CCA GTG ACT GTG GCC AGT GCT GCC 5870 
Arg Val He Leu Ala Gly Pro Met Pro Val Thr Val Ala Ser Ala Ala 
1915 192G 1925 1930 

CAG AGG AGA GGA AGA ATT GGA AGG AAC CAA AAC AAA GAA GGT GAT CAG 5918 
Gin Arg Arg Gly Arg He Gly Arg Asn Gin Asn Lys Glu Gly Asp Gin 
1935 1940 * 1945 

TAC GTT TAC ATG GGA CAG CCT TTA AAT AAT GAT GAG GAT CAC GCT CAT 5966 
Tyr Val Tyr Met Gly Gin Pro Leu Asn Asn Asp Glu Asp His Ala His 
1950 1955 1960 

TGG ACA GAA GCA AAA ATG CTC CTT GAC AAT ATA AAC ACA CCA GAA GGG 6014 
Trp Thr Glu Ala Lys Met Leu Leu Asp Asn He Asn Thr Pro Glu Gly 
1965 1970 1975 

ATC ATC CCA GCC CTC TTT GAG CCA GAG AGA GAA AAG AGT GCA GCA ATA 6062 
He He Pro Ala Leu Phe Glu Pro Glu Arg Glu Lys Ser Ala Ala He 
1980 1985 1990 

GAC GGG GAG TAC AGA CTG CGG GGA GAA GCA AGA AAA ACG TTT GTG GAG 6110 
Asp Gly Glu Tyr Arg Leu Arg Gly Glu Ala Arg Lys Thr Phe Val Glu 
1995 2000 2005 2010 

CTC ATG AGA AGA GGA GAT CTA CCT GTC TGG CTA TCC TAC AAA GTT GCC 6158
Leu Met Arg Arg Gly Asp Leu Pro Val Trp Leu Ser Tyr Lys Val Ala
2015 2020 2025

TCA GAA GGC TTC CAG TAC TCT GAC AGA AGA TGG TGC TTT GAC GGG GAA 6206 
Ser Glu Gly Phe Gin Tyr Ser Asp Arg Arg Trp Cys Phe Asp Gly Glu 
2030 2035 " 2040 

AGG AAC AAC CAG GTG TTG GAG GAG AAC ATG GAC GTG GAG ATG TGG ACA 6254 
Arg Asn Asn Gin Val Leu Glu Glu Asn Met Asp Val Glu Met Trp Thr 
2045 2050 ... 2055 

AAA GAA GGA GAA CGA AAG AAA CTA CGA CCC CGC TGG CTG GAT GCC AGA 6302 
Lys Glu Gly Glu Arg Lys Lys Leu Arg Pro Arg Trp Leu Asp Ala Arg 
2060 2065 2070 

ACA TAC TCA GAC CCA CTG GCC CTG CGC GAG TTT AAA GAG TTT GCA GCA 6350 
Thr Tyr Ser Asp Pro Leu Ala Leu Arg Glu Phe Lys Glu Phe Ala Ala 
2075 2080 208S 2090 

GGA AGA AGA AGT GTC TCA GGT GAT CTA ATA TTA GAA ATA GGG AAA CTT 6398
Gly Arg Arg Ser Val Ser Gly Asp Leu Ile Leu Glu Ile Gly Lys Leu
2095 2100 2105 

CCA CAA CAC TTG ACG CAA AGG GCC CAG AAT GCC TTG GAC AAC CTG GTT 6446 
Pro Gin Hid Leu Thr Gin Arg Ala Gin Asn Ala Leu Asp Asn Leu Val 
2110 2115 2120 

ATG TTG CAC AAC TCC GAA CAA GGA GGA AGA GCC TAC AGA CAT GCA ATG 6494 
Met Leu His Asn Ser Glu Gin Gly Gly Arg Ala Tyr Arg His Ala Met 
2125 2130 2135 

GAA GAA CTT CCA GAC ACC ATA GAA ACG TTG ATG CTC CTA GCT TTG ATA 6542 
Glu Glu Leu Pro Asp Thr lie Glu Thr Leu Met Leu Leu Ala Leu lie 
2140 2145 2150 

GCT GTG TTA ACT GGT GGA GTG ACG CTG TTC TTC CTA TCA GGA AAG GGC 6590 
Ala Val Leu Thr Gly Gly Val Thr Leu Phe Phe Leu Ser Gly Lys Gly 
2155 2160 2165 2170 

CTA GGG AAA ACA TCT ATT GGC CTA CTC TGC GTG ATG GCT TCA AGC GTA 6638 
Leu Gly Lys Thr Ser lie Gly Leu Leu Cys Val Met Ala Ser Ser Val 
2175 2180 2185 

CTG CTA TGG ATG GCC AGC GTG GAG CCT CAT TGG ATA GCG GCC TCC ATC 6686
Leu Leu Trp Met Ala Ser Val Glu Pro His Trp Ile Ala Ala Ser Ile
2190 2195 2200 

ATA CTA GAG TTT TTC CTG ATG GTG CTG CTT ATT CCA GAG CCA GAC AGA 6734 
He Leu Glu Phe, Phe Leu Met Val Leu Leu lie Pro Glu Pro Asp Arg 
2205 2210 2215 

CAG CGC ACT CCA CAG GAC AAC CAG TTA GCA TAT GTG GTG ATA GGT TTG 6782 
Gin Arg Thr Pro Gin Asp Asn Gin Leu Ala Tyr Val Val He Gly Leu 
2220 2225 2230 

TTA TTC ATG ATA CTC ACA GTG GCA GCC AAT GAG ATG GGA TTA TTG GAA 6830 
Leu Phe Met He Leu Thr Val Ala Ala Asn Glu Met Gly Leu Leu Glu 
2235 2240 2245 2250 

ACC ACA AAG AAA GAC TTA GGG ATT GGC CAT GTA GCC GCC GAA AAC CAC 6878 
Thr Thr Lys Lys Asp Leu Gly He Gly His Val Ala Ala Glu Asn His 
2255 2260 2265 

CAC CAT GCT ACA ATG CTG GAC GTA GAC CTA CGT CCA GCT TCA GCC TGG 6926 
His His Ala Thr Met Leu Asp Val Asp Leu Arg Pro Ala Ser Ala Trp 
2270 2275 2280 

ACC CTC TAT GCA GTA GCC ACA ACA GTT ATC ACC CCC ATG ATG AGA CAC 6974 
Thr Leu Tyr Ala Val Ala Thr Thr Val He Thr Pro Met Met Arg His 
2285 2290 2295 

ACA ATT GAA AAT ACA ACG GCA AAT ATT TCC CTG ACA GCC ATT GCA AAC 7022 
Thr He Glu Asn Thr Thr Ala Asn He Ser Leu Thr Ala He Ala Asn 
2300 2305 2310 

CAG GCA GCT ATA TTG ATG GGA CTT GAT AAA GGA TGG CCA ATA TCG AAG 7070 
Gin Ala Ala He Leu Met Gly Leu Asp Lys Gly Trp Pro lie Ser Lys 
2315 2320 2325 2330 

ATG GAC ATA GGA GTT CCA CTT CTC GCC TTG GGG TGC TAT TCC CAG GTG 7118 
Met Asp lie Gly Val Pro Leu Leu- Ala Leu Gly Cys Tyr Ser Gin Val 
2335 2340 2345 
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AAT CCA CTG ACG CTG ACA GCG GCG GTA TTG ATG CTA GTG GCT CAT TAC 
Asn Pro Leu Thr Leu Thr Ala Ala Val Leu Met Leu Val Ala His Tyr 
2350 2355 2360 

GCC ATA ATT GGA CCT GGA CTG CAA GCA AAA GCG ACT AGA GAA GCT CAA 
Ala lie lie Gly Pro Giy Leu Gin Ala Lys Ala Thr Arg Glu Ala Gin 
2365 2370 2375 

AAA AGG ACA GCG GCC GGA ATA ATG AAA AAT CCA ACC GTT GAT GGA ATT 
Lys Arg Thr Ala Ala Gly He Met Lys Asn Pro Thr Val Asp Gly He 
2380 2385 2390 

GTT GCA ATA GAT TTG GAC CCT GTG GTT TAT GAT GCA AAA TTT GAG AAA 
Val Ala lie Asp Leu Asp Pro Val Val Tyr Asp Ala Lys Phe Glu Lys 
2395 2400 2405 2410 

CAA CTA GGC CAA ATA ATG TTG TTG ATA CTA TGC ACA TCA CAG ATC CTC 
Gin Leu Gly Gin He Met Leu Leu He Leu Cys Thr Ser Gin lie Leu 
2415 2420 2425 

TTG ATG CGG ACT ACA TGG GCC TTG $GT GAA TCC ATC ACA CTG GCC ACT 
Leu Met Arg Thr Thr Trp Ala Leu Cys Glu Ser lie Thr Leu Ala Thr 
2430 2435 2440 

GGA CCT CTG ACC ACG CTT TGG GAG GGA TCT CCA GGA AAA TTT TGG AAC 
Gly Pro Leu Thr Thr Leu Trp Glu Gly Ser Pro Gly Lys Phe Trp Asn 
2445 2450 2455 

ACC ACG ATA GCG GTT TCC ATG GCA AAC ATT TTC AGG GGA AGT TAT CTA 
Thr Thr He Ala Val Ser Met Ala Asn He Phe Arg Gly Ser Tyr Leu 
2460 2465 247G 



GCA GGA GCA GGC CTG GCC TTC TCA TTA ATG AAA TCT CTA GGA GGA GGT 
Ala Gly Ala Gly Leu Ala Phe Ser Leu Met Lys Ser Leu Gly Gly Glv 
2475 2480 2485 2490 



AGG AGA GGT ACG GGA GCC AAG GGG AAA CAC TGG GAG AGA AAT GGA AAA 
Arg Arg Gly Thr Gly Ala Lys Gly Lys His Trp Glu Arg Asn Gly Lys 
2495 2500 2505 

GAC AGA CTG AAC CAA CTG AGC AAG TCA GAA TTC AAC ACT TAC AAA AGG 
Asp Arg Leu Asn Gin Leu Ser Lys Ser Glu Phe Asn Thr Tyr Lys Arg 
2510 2515 2520 

AGT GGG ATT ATG GAA GTG GAC AGA TCC GAA GCC AAA GAG GGA CTG AAA 
Ser Gly He Met Glu Val Asp .Arg Ser Glu Ala Lys Glu Gly Leu Lys 
2525 2530 2535 



AGA GGA GAA ACA ACC AAA CAT GCA GTG TCG AGA GGA ACC GCC AAA TTG 
Arg Gly Glu Thr Thr Lys His Ala Val Ser Arg Gly Thr Ala Lys Leu 
2540 2545 2550 

AGG TGG TTC GTG GAG AGG AAC CTT GTG AAA CCA GAA GGG AAA GTC ATA 
Arg Trp Phe Val Glu Arg Asn Leu Val Lys Pro Glu Gly Lys Val He 
2555 2560 2565 2570 

GAC CTC GGT TGT GGA. AGA GGT GGC TGG TCA TAC TAT TGC GCT GGG CTG 
Asp Leu Gly Cys Gly Arg Gly Gly Trp Ser Tyr Tyr Cys Ala Gly Leu 
2575 2580 2585 

AAG AAA GTC ACA GAA 6TG AAG GGA TAC ACA AAA GGA GGA CCT GGA CAT 
Lys Lys Val Thr Glu Val Lys Gly Tyr Thr Lys Gly Gly Pro Gly His 
2590 2595 2600 
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S CA ATC CCA ATG GCG ACC TAT GGA TGG AAC CTA* CTA^ AAG CTA 
G1U «nc Ile Pr ° Met Ala Thr T * r G1 * Tr ? Asn Leu val Lys Leu 
2605 2610 2615 

TAC TCC GGG AAA GAC GTA TTC TTT ACA CCA CCT GAG AAG TGT GAC ACC 
Tyr Ser^Gly Lys Asp Val Ph^Phe Thr Pro Pro Glu^s Ifs Asp Sir 

CTT TTG TGT GAT ATT GGT GAG TCC TCT CCA AAC CCA ACT ATA GAA GAA 
Leu Leu cys Asp lie Gly Glu Ser Ser Pro Asn Pro £ lie G ™ Su 
2635 2640 2645 2650 

Si JJa Thr ™ f C S TC T CTA ATG GTG GAA CCA TGG CTC AGA GGG 
Gly Arg Thr Leu Arg Val Leu Lys Met Val Glu Pro Trp Leu Arq Glv 
2655 2660 2665 

AAC CWV TTT TGC ATA AAA ATT CTA AAT CCC TAC ATG CCA ACT GTG GTG 
Asn Gin Phe Cys lie Lys lie Leu Asn Pro Tyr Met Pro Ser Val Val 
26 "0 2675 2680 

GAA ACT CTG GAG CAA ATG CAA AGA AAA CAT GGA GGA ATG CTA GTG CGG 
Glu Thr Leu Glu Gin Met Gin Arg Lys His Gly Gly Met Leu Val Arg 
2685 2690 2695 

AAT CCA CTT TCA AGA AAT TCT ACT CAT GAA ATG TAT TGG GTT TCA TGT 
2700 ASO f70S Thr HiS G1U MSt 27 r ?rP " 

r? A Su A S GA AAC A F ° TG TCA GCA GTA MC ATG ACA TCT AGA ATG TTG 
Gly Thr Gly Asn lie Val Ser Ala Val Asn Met Thr Ser Arg Met Leu 
2715 2720 2725 2730 

f! A ^1 CSA U° S CA ATG GCT CAC AGG *** CCA ACA TAT GAA AGA GAC 
Leu Asn Arg Phe Thr Met Ala His Arg Lys Pro Thr Tyr Glu Arg Asp 
2735 2740 2745 

GTG GAC TTA CGC GCT GGA ACA AGA CAT GTG GCA GTG GAA CCA GAG GTA 
Val Asp Leu Gly Ala Gly Thr Arg His Val Ala Val Glu Pro Glu Val 
2750 2755 2760 

GCC AAC CTA GAT ATC ATT GGC CAG AGG ATA GAG AAC ATA AAA CAT GAA 
Ala Asn leu Asp lie He Gly Gin Arg lie Glu Asn lie Lys His Glu 
2765 2770 2775 

CAT AAG TCA ACA TGG CAT TAT GAT GAG GAC AAT CCA TAT AAA ACA TGG 

Thr Trp HiS S* ASP Glu As P Asn Pro Lys Thr Trp 
2'80 2785 2790 

GCC TAT CAT GGA TCA TAT GAG GTC AAG CCA TCA GGA TCA GCC TCA TCC 
Ala Tyr His Gly Ser Tyr Glu Val Lys Pro Ser Gly ler Ala s2 Ser 
2795 2800 2805 2810 

ill vll SSI G°v SI? F" CTC ACC *** CCA TGG GAT GCG ATC 

Met Val Asn Gly Val Val Lys Leu Leu Thr Lys Pro Trp Asp Ala He 

2815 2820 * 282S 

S ml SI? ^ ?? A GCC ATG ACT GAC ACC ACA CCC TTT GGA CAA 

Pro Met Val Thr Gin lie Ala Met Thr Asp Thr Thr Pro Phe Gly Gin 

2830 2835 2 840 

CAG AGG GTG TTT AAA GAG AAA GTT GAC ACG CGC ACA CCA AAA GCA AAA 
Gin Arg Val Phe Lys Glu Lys Val Asp Thr Arg Thr PrS JJi S £yU 
2845 2850 2855 
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CGA GGC ACA GCA GAA ATC ATG GAG GTG ACA GCC AGG TGG TTA TGG GGT 8702 
Arg Gly Thr Ala Gin lie Met Glu Val Thr Ala Arg Trp Leu Trp Gly 
2860 2865 2870 

TTT GTC TCT AGA AAC AAA AAA CCA AGA ATT TGT ACA AGA GAG GAG TTC 8750 
Phe Leu Ser Arg Asn Lys Lys Pro Arg lie Cys Thr Arg Glu Glu Phe 
2875 2880 2885 2890 

ACA AGA AAA GTT AGG TCA AAC GCA. GCC ATT GGA GCA GTG TTC GTT GAT 8798 
Thr Arg Lys Val Arg Ser Asn Ala Ala lie Gly Ala Val Phe Val Asp 
2895 2900 2905 

GAA AAT CAA TGG AAC TCA GCA AAA GAA GCA GTG GAA GAT GAG CGG TTC 8846 
Glu Asn Gin Trp Asn Ser Ala Lys Glu Ala Val Glu Asp Glu Arg Phe 
2910 2915 2920 

TGG GAC CTT GTG CAC AGA GAG AGG GAG CTT CAC AAA CAG GGA AAA TGT 8894 
Trp Asp Leu Val His Arg Glu Arg Glu Leu His Lys Gin Gly Lys Cys 
2925 ' 2930 * 2935 

GCC ACG TGT GTT TAC AAC ATG ATG GGG AAG AGA GAG AAA AAA CTA GGA 8942 
Ala Thr Cys Val Tyr Asn Met Met Gly Lys Arg Glu Lys Lys Leu Gly 
2940 2945 2950 

GAG TTC GGA AAG GCA AAA GGA AG! CGT GCA ATA TGG. TAC ATG TGG TTG 3990 
Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala lie Trp Tyr Met Trp Leu 
2955 2960 2965 " 2970 

GGA GCA CGC TTT CTA GAG TTC GAA GCT CTT GGT TTC ATG AAC GAA GAT 9038 
Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe Met Asn Glu Asp 
2975 2980 2985 

CAC TGG TTC AGT AGA GAG AAT TCA CTC AGT GGA GTG GAA GGA GAA GGA 9086 
His Trp Phe Ser Arg Glu Asn Ser Leu Ser Gly Val Glu Gly Glu Gly 
2990 2995 3000 

CTC CAC AAA CTC GGA TAT ATA CTC AGA GAC ATA TCA AAG ATT CCA GGG 9134 
Leu His Lys Leu Gly Tyr lie Leu Arg Asp lie Ser Lys He Pro Gly 
3005 3010 3015 

GGA AAT ATG TAT GCA GAT GAC ACA GCC GGA TGG GAT ACA AGG ATA ACA 9182 
Gly Asn Met Tyr Ala. Asp Asp Thr Ala Gly Trp Asp Thr Arg He Thr 
3020 3025 3030 

GAG GAT GAT CTT CAG AAT GAG GCC AAA ATT ACT GAC ATC ATG GAG CCC 9230 
Glu Asp Asp Leu Gin Asn Glu Ala Lys He Thr Asp lie Met Glu Pro 
3035 3040 3045 3050 

GAA CAT GCC CTA CTG GCT ACG TCA ATC TTC AAG CTG ACC TAC CAA AAT 9278 
Glu His Ala Leu Leu Ala Thr Ser He Phe Lys Leu Thr Tyr Gin Asn 
3055 3060 3065 

AAG GTG GTA AGG GTA CAG AGA CCA GCG AAA AAT GGA ACC GTG ATG GAT 9326 
Lys Val Val Arg Val Gin Arg Pro Ala Lys Asn Gly Thr Val Met Asp 
. 3070 3075 3080 

GTC ATA TCC AGA CGT GAC CAG AGA GGA AGT GGC CAG GTC GGA ACT TAT 9374 
Val He Ser Arg Arg Asp Gin Arg Gly Ser Gly Gin Val Gly Thr Tyr 
3085 3090 3095 

GGC TTA AAC ACT TTC ACT AAC ATG GAA GCC CAG CTA ATA AGA CAA ATG 9422 
Gly Leu Asn Thr Phe Thr Asn Met Glu Ala Gin Leu He Arg Gin Met 
3100 3105 3110 
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GAG TCT GAG GGA ATC TTT TCA CCC AGC GAA TTG GAG ACC CCA AAT TTA 9470 
Glu Ser Glu Gly lie Phe Ser Pro Ser Glu Leu Glu Thr Pro Asn Leu 
3115 3120 3125 3130 

GCC GAG AGA GTT CTC GAC TGG. CTG GAA AAA TAT GGC GTC GAA AGG CTG 9518 
Ala Glu Arg Val Leu Asp Trp Leu Glu Lys Tyr Gly Val Glu Arg Leu 
3135 3140 3145 

AAA AGA ATG GCA ATC AGC GGA GAT GAC TGC GTG GTG AAA CCA ATT GAT 9566 
Lys Arg Met Ala He Ser Gly Asp Asp Cys Val Val Lys Pro He Asp 
3150 3155 3160 

GAC AGG TTC GCA ACA GCC TTA ACA GCT CTG AAT GAT ATG GGA AAA GTA 9614 
Asp Arg Phe Ala Thr Ala Leu Thr Ala Leu Asn Asp Met Gly Lys Val 
3165 3170 3175 

AGA AAA GAT ATA CCA CAA TGG GAA CCC TCA AAA GGA TGG AAT GAT TGG 9662 
Arg Lys Asp Ile Pro Gln Trp Glu Pro Ser Lys Gly Trp Asn Asp Trp
3180 3185 3190 

CAA CAG GTG CCT TTT TGT TCA CAC CAT TTC CAC CAG CTG ATT ATG AAG 9710 
Gln Gln Val Pro Phe Cys Ser His His Phe His Gln Leu Ile Met Lys
3195 3200 3205 3210 

GAT GGG AGG GAA ATA GTG GTG CCA TGC CGC AAC CAA GAT GAA CTT GTG 9758 
Asp Gly Arg Glu Ile Val Val Pro Cys Arg Asn Gln Asp Glu Leu Val
3215 3220 3225 

GGT AGG GCT AGA GTA TCA CAA GGT GCT GGA TGG AGC CTG AGA GAA ACT 9806
Gly Arg Ala Arg Val Ser Gln Gly Ala Gly Trp Ser Leu Arg Glu Thr
3230 3235 3240

GCA TGC CTA GGC AAG TCA TAT GCA CAA ATG TGG CAG CTG ATG TAC TTC 9854 
Ala Cys Leu Gly Lys Ser Tyr Ala Gln Met Trp Gln Leu Met Tyr Phe
3245 3250 3255 

CAC AGG AGA GAC CTG AGA CTA GCT GCT AAT GCT ATC TGT TCA GCC GTT 9902 
His Arg Arg Asp Leu Arg Leu Ala Ala Asn Ala Ile Cys Ser Ala Val
3260 3265 3270 

CCA GTT GAT TGG GTC CCA ACC AGC CGC ACC ACT TGG TCG ATC CAT GCC 9950 
Pro Val Asp Trp Val Pro Thr Ser Arg Thr Thr Trp Ser Ile His Ala
3275 3280 3285 3290 

CAT CAC CAA TGG ATG ACA ACA GAA GAC ATG TTG TCA GTG TGG AAT AGG 9998 
His His Gln Trp Met Thr Thr Glu Asp Met Leu Ser Val Trp Asn Arg
3295 3300 3305 

GTT TGG ATA GAG GAA AAC CCA TGG ATG GAG GAC AAA ACC CAT GTA TCC 10046 
Val Trp Ile Glu Glu Asn Pro Trp Met Glu Asp Lys Thr His Val Ser
3310 3315 3320 

AGT TGG GAA GAT GTT CCA TAT TTA GGA AAA AGG GAA GAT CAG TGG TGT 10094 
Ser Trp Glu Asp Val Pro Tyr Leu Gly Lys Arg Glu Asp Gln Trp Cys
3325 3330 3335 

GGA TCC CTG ATA GGC TTA ACA GCA AGG GCT ACC TGG GCC ACC AAC ATA 10142 
Gly Ser Leu Ile Gly Leu Thr Ala Arg Ala Thr Trp Ala Thr Asn Ile
3340 3345 3350 

CAA GTG GCC ATA AAC CAA GTG AGA AGA CTA ATC GGG AAT GAG AAT TAT 10190
Gln Val Ala Ile Asn Gln Val Arg Arg Leu Ile Gly Asn Glu Asn Tyr
3355 3360 3365 3370

CTA GAT TAC ATG ACA TCA ATG AAG AGA TTC AAG AAC GAG AGT GAT CCG 10238
Leu Asp Tyr Met Thr Ser Met Lys Arg Phe Lys Asn Glu Ser Asp Pro
3375 3380 3385 

AAG GGG CAC TCT GGT GAG TCA ACA CAC TTA TGAAAATAAA GGAAAATAAG 10288 
Lys Gly His Ser Gly Glu Ser Thr His Leu 
3390 3395 

AAATCAAACA AGGCAAGAAG TCAGGCCGGA TTAAGCCATA GTACGGTAAG AGCTATGCTG 10348 

CCTGTGAGCC CCGTCCAAGG ACGTAAAATG AAGTCAGGCC GAAAGCCACG GTTTGAGCAA 10408 

ACCGTGCTGC CTGTAGCTTC ATCGTGGGGA TGTAAAAACC TGGGAGGCTG CAACCCATGG 10468 

AAGCTGTACG CATGGGGTAG CAGACTAGTG GTTAGAGGAG ACCCCTCCCA AAACATAACG 10528

CAGCAGCGGG GCCCAACACC AGGGGAAGCT GTATCCTGGT GGTAAGGACT AGAGGTTAGA 10588 

GGAGACCCCC GGCATAACAA TAAACAGCAT ATTGACGCTG GGAGAGACCA GAGATCCTGC 10648 

TGTCTCTACA GCATCATTCC AGGCACAGAA CGCCAGAAAA TGGAATGGTG CTGTTGAATC 10708 

AACAGGTTCT 10718 
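
The codon rows of the listing above and the three-letter residue rows printed beneath them can be cross-checked mechanically. The following is a minimal Python sketch for that purpose, not part of the original disclosure. It assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative choices. It translates the final coding row of the listing and stops at the TGA codon.

```python
from typing import List

# Standard genetic code with bases ordered T, C, A, G (assumed here; the
# listing itself does not state which code table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> List[str]:
    """Translate a DNA string (spaces allowed) until the first stop codon."""
    dna = "".join(dna.split()).upper()
    residues = []
    for pos in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[pos:pos + 3]]
        if amino == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[amino])
    return residues


# Final coding row of the listing, followed by the TGA stop codon.
last_row = "AAG GGG CAC TCT GGT GAG TCA ACA CAC TTA TGA"
print(" ".join(translate(last_row)))
# prints: Lys Gly His Ser Gly Glu Ser Thr His Leu
```

The printed residues agree with the last residue row of the listing (positions 3387 to 3396). The same function can be applied to any other codon row to check a doubtful character against its residue row.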

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3396 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asn Asn Gln Arg Lys Lys Thr Ala Arg Pro Ser Phe Asn Met Leu
1 5 10 15

Lys Arg Ala Arg Asn Arg Val Ser Thr Gly Ser Gln Leu Ala Lys Arg
20 25 30

Phe Ser Lys Gly Leu Leu Ser Gly Gln Gly Pro Met Lys Leu Val Met
35 40 45

Ala Phe Ile Ala Phe Leu Arg Phe Leu Ala Ile Pro Pro Thr Ala Gly
50 55 60

Ile Leu Ala Arg Trp Gly Ser Phe Lys Lys Asn Gly Ala Ile Lys Val
65 70 75 80

Leu Arg Gly Phe Lys Lys Glu Ile Ser Asn Met Leu Asn Ile Met Asn
85 90 95

Arg Arg Lys Arg Ser Val Thr Met Leu Leu Met Leu Leu Pro Thr Ala 
100 105 110 

Leu Ala Phe His Leu Thr Thr Arg Gly Gly Glu Pro His Met Ile Val
115 120 125 

Ser Lys Gln Glu Arg Glu Lys Ser Leu Leu Phe Lys Thr Ser Val Gly
130 135 140

Val Asn Met Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Leu Cys Glu
145 150 155 160

Asp Thr Met Thr Tyr Lys Cys Pro Arg Ile Thr Glu Ala Glu Pro Asp
165 170 175

Asp Val Asp Cys Trp Cys Asn Ala Thr Asp Thr Trp Val Thr Tyr Gly 
180 185 190

Thr Cys Ser Gln Thr Gly Glu His Arg Arg Asp Lys Arg Ser Val Ala
195 200 205

Leu Ala Pro His Val Gly Leu Gly Leu Glu Thr Arg Thr Glu Thr Trp
210 215 220

Met Ser Ser Glu Gly Ala Trp Lys Gln Ile Gln Arg Val Glu Thr Trp
225 230 235 240

Ala Leu Arg His Pro Gly Phe Thr Val Ile Ala Leu Phe Leu Ala His
245 250 255

Ala Ile Gly Thr Ser Ile Thr Gln Lys Gly Ile Ile Phe Ile Leu Leu
260 265 270

Met Leu Val Thr Pro Ser Met Ala Met Arg Cys Val Gly Ile Gly Ser
275 280 285

Arg Asp Phe Val Glu Gly Leu Ser Gly Ala Thr Trp Val Asp Val Val 
290 295 300 

Leu Glu His Gly Ser Cys Val Thr Thr Met Ala Lys Asp Lys Pro Thr 
305 310 315 320

Leu Asp Ile Glu Leu Leu Lys Thr Glu Val Thr Asn Pro Ala Val Leu
325 330 335

Arg Lys Leu Cys Ile Glu Ala Lys Ile Ser Asn Thr Thr Thr Asp Ser
340 345 350

Arg Cys Pro Thr Gln Gly Glu Ala Thr Leu Val Glu Glu Gln Asp Ala
355 360 365

Asn Phe Val Cys Arg Arg Thr Phe Val Asp Arg Gly Trp Gly Asn Gly
370 375 380

Cys Gly Leu Phe Gly Lys Gly Ser Leu Leu Thr Cys Ala Lys Phe Lys
385 390 395 400

Cys Val Thr Lys Leu Glu Gly Lys Ile Val Gln Tyr Glu Asn Leu Lys
405 410 415

Tyr Ser Val Ile Val Thr Val His Thr Gly Asp Gln His Gln Val Gly
420 425 430

Asn Glu Thr Thr Glu His Gly Thr Ile Ala Thr Ile Thr Pro Gln Ala
435 440 445

Pro Thr Ser Glu Ile Gln Leu Thr Asp Tyr Gly Ala Leu Thr Leu Asp
450 455 460

Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn Glu Met Val Leu Leu Thr
465 470 475 480

Met Lys Glu Lys Ser Trp Leu Val His Lys Gln Trp Phe Leu Asp Leu
485 490 495

Pro Leu Pro Trp Thr Ser Gly Ala Ser Thr Ser Gln Glu Thr Trp Asn
500 505 510

Arg Gln Asp Leu Leu Val Thr Phe Lys Thr Ala His Ala Lys Lys Gln
515 520 525

Glu Val Val Val Leu Gly Ser Gln Glu Gly Ala Met His Thr Ala Leu
530 535 540

Thr Gly Ala Thr Glu Ile Gln Thr Ser Gly Thr Thr Thr Ile Phe Ala
545 550 555 560

Gly His Leu Lys Cys Arg Leu Lys Met Asp Lys Leu Thr Leu Lys Gly
565 570 575

Met Ser Tyr Val Met Cys Thr Gly Ser Phe Lys Leu Glu Lys Glu Val
580 585 590

Ala Glu Thr Gln His Gly Thr Val Leu Val Gln Val Lys Tyr Glu Gly
595 600 605

Thr Asp Ala Pro Cys Lys Ile Pro Phe Ser Thr Gln Asp Glu Lys Gly
610 615 620 

Val Thr Gln Asn Arg Leu Ile Thr Ala Asn Pro Ile Val Thr Asp Lys
625 630 635 640

Glu Lys Pro Val Asn Ile Glu Thr Glu Pro Pro Phe Gly Glu Ser Tyr
645 650 655

Ile Val Val Gly Ala Gly Glu Lys Ala Leu Lys Gln Cys Trp Phe Lys
660 665 670

Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ala Thr Ala Arg Gly Ala
675 680 685

Arg Arg Met Ala Ile Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Ile
690 695 700

Gly Gly Val Phe Thr Ser Val Gly Lys Leu Val His Gln Val Phe Gly
705 710 715 720

Thr Ala Tyr Gly Val Leu Phe Ser Gly Val Ser Trp Thr Met Lys Ile
725 730 735

Gly Ile Gly Ile Leu Leu Thr Trp Leu Gly Leu Asn Ser Arg Ser Thr
740 745 750

Ser Leu Ser Met Thr Cys Ile Ala Val Gly Met Val Thr Leu Tyr Leu
755 760 765

Gly Val Met Val Gln Ala Asp Ser Gly Cys Val Ile Asn Trp Lys Gly
770 775 780

Arg Glu Leu Lys Cys Gly Ser Gly Ile Phe Val Thr Asn Glu Val His
785 790 795 800

Thr Trp Thr Glu Gln Tyr Lys Phe Gln Ala Asp Ser Pro Lys Arg Leu
805 810 815

Ser Ala Ala Ile Gly Lys Ala Trp Glu Glu Gly Val Cys Gly Ile Arg
820 825 830

Ser Ala Thr Arg Leu Glu Asn Ile Met Trp Lys Gln Ile Ser Asn Glu
835 840 845

Leu Asn His Ile Leu Leu Glu Asn Asp Met Lys Phe Thr Val Val Val
850 855 860

Gly Asp Val Val Gly Ile Leu Ala Gln Gly Lys Lys Met Ile Arg Pro
865 870 875 880

Gln Pro Met Glu His Lys Tyr Ser Trp Lys Ser Trp Gly Lys Ala Lys
885 890 895

Ile Ile Gly Ala Asp Ile Gln Asn Thr Thr Phe Ile Ile Asp Gly Pro
900 905 910

Asp Thr Pro Glu Cys Pro Asp Asp Gln Arg Ala Trp Asn Ile Trp Glu
915 920 925

Val Glu Asp Tyr Gly Phe Gly Ile Phe Thr Thr Asn Ile Trp Leu Lys
930 935 940

Leu Arg Asp Ser Tyr Thr Gln Met Cys Asp His Arg Leu Met Ser Ala
945 950 955 960

Ala Ile Lys Asp Ser Lys Ala Val His Ala Asp Met Gly Tyr Trp Ile
965 970 975

Glu Ser Glu Lys Asn Glu Thr Trp Lys Leu Ala Arg Ala Ser Phe Ile
980 985 990

Glu Val Lys Thr Cys Val Trp Pro Lys Ser His Thr Leu Trp Ser Asn
995 1000 1005

Gly Val Leu Glu Ser Glu Met Ile Ile Pro Lys Ile Tyr Gly Gly Pro
1010 1015 1020

Ile Ser Gln His Asn Tyr Arg Pro Gly Tyr Phe Thr Gln Thr Ala Gly
1025 1030 1035 1040

Pro Trp His Leu Gly Lys Leu Glu Leu Asp Phe Asp Leu Cys Glu Gly 
1045 1050 1055

Thr Thr Val Val Val Asp Glu His Cys Gly Asn Arg Gly Pro Ser Leu
1060 1065 1070

Arg Thr Thr Thr Val Thr Gly Lys Ile Ile His Glu Trp Cys Cys Arg
1075 1080 1085

Ser Cys Thr Leu Pro Pro Leu Arg Phe Lys Gly Glu Asp Gly Cys Trp
1090 1095 1100

Tyr Gly Met Glu Ile Arg Pro Val Lys Glu Lys Glu Glu Asn Leu Val
1105 1110 1115 1120

Lys Ser Met Val Ser Ala Gly Ser Gly Glu Val Asp Ser Phe Ser Leu 
1125 1130 1135 

Gly Leu Leu Cys Ile Ser Ile Met Ile Glu Glu Val Met Arg Ser Arg
1140 1145 1150

Trp Ser Arg Lys Met Leu Met Thr Gly Thr Leu Ala Val Phe Leu Leu
1155 1160 1165 

Leu Ile Met Gly Gln Leu Thr Trp Asn Asp Leu Ile Arg Leu Cys Ile
1170 1175 1180 

Met Val Gly Ala Asn Ala Ser Asp Arg Met Gly Met Gly Thr Thr Tyr
1185 1190 1195 1200 

Leu Ala Leu Met Ala Thr Phe Lys Met Arg Pro Met Phe Ala Val Gly 
1205 1210 1215 

Leu Leu Phe Arg Arg Leu Thr Ser Arg Glu Val Leu Leu Leu Thr Ile
1220 1225 1230 

Gly Leu Ser Leu Val Ala Ser Val Glu Leu Pro Asn Ser Leu Glu Glu 
1235 1240 1245 

Leu Gly Asp Gly Leu Ala Met Gly Ile Met Ile Leu Lys Leu Leu Thr
1250 1255 1260

Asp Phe Gln Ser His Gln Leu Trp Ala Thr Leu Leu Ser Leu Thr Phe
1265 1270 1275 1280

Val Lys Thr Thr Phe Ser Leu His Tyr Ala Trp Lys Thr Met Ala Met
1285 1290 1295

Val Leu Ser Ile Val Ser Leu Phe Pro Leu Cys Leu Ser Thr Thr Ser
1300 1305 1310

Gln Lys Thr Thr Trp Leu Pro Val Leu Leu Gly Ser Leu Gly Cys Lys
1315 1320 1325

Pro Leu Thr Met Phe Leu Ile Ala Glu Asn Lys Ile Trp Gly Arg Lys
1330 1335 1340

Ser Trp Pro Leu Asn Glu Gly Ile Met Ala Val Gly Ile Val Ser Ile
1345 1350 1355 1360

Leu Leu Ser Ser Leu Leu Lys Asn Asp Val Pro Leu Ala Gly Pro Leu 
1365 1370 1375 

Ile Ala Gly Gly Met Leu Ile Ala Cys Tyr Val Ile Ser Gly Ser Ser
1380 1385 1390

Ala Asp Leu Ser Leu Glu Lys Ala Ala Glu Val Ser Trp Glu Glu Glu
1395 1400 1405

Ala Glu His Ser Gly Ala Ser His Asn Ile Leu Val Glu Val Gln Asp
1410 1415 1420

Asp Gly Thr Met Lys Ile Lys Asp Glu Glu Arg Asp Asp Thr Leu Thr
1425 1430 1435 1440

Ile Leu Leu Lys Ala Thr Leu Leu Ala Val Ser Gly Val Tyr Pro Leu
1445 1450 1455

Ser Ile Pro Ala Thr Leu Phe Val Trp Tyr Phe Trp Gln Lys Lys Lys
1460 1465 1470

Gln Arg Ser Gly Val Leu Trp Asp Thr Pro Ser Pro Pro Glu Val Glu
1475 1480 1485

Arg Ala Val Leu Asp Asp Gly Ile Tyr Arg Ile Met Gln Arg Gly Leu
1490 1495 1500

Leu Gly Arg Ser Gln Val Gly Val Gly Val Phe Gln Asp Gly Val Phe
1505 1510 1515 1520

His Thr Met Trp His Val Thr Arg Gly Ala Val Leu Met Tyr Gln Gly
1525 1530 1535

Lys Arg Leu Glu Pro Ser Trp Ala Ser Val Lys Lys Asp Leu Ile Ser
1540 1545 1550

Tyr Gly Gly Gly Trp Arg Phe Gln Gly Ser Trp Asn Thr Gly Glu Glu
1555 1560 1565

Val Gln Val Ile Ala Val Glu Pro Gly Lys Asn Pro Lys Asn Val Gln
1570 1575 1580

Thr Ala Pro Gly Thr Phe Lys Thr Pro Glu Gly Glu Val Gly Ala Ile
1585 1590 1595 1600

Ala Leu Asp Phe Lys Pro Gly Thr Ser Gly Ser Pro Ile Val Asn Arg
1605 1610 1615

Glu Gly Lys Ile Val Gly Leu Tyr Gly Asn Gly Val Val Thr Thr Ser
1620 1625 1630

Gly Thr Tyr Val Ser Ala Ile Ala Gln Ala Lys Ala Ser Gln Glu Gly
1635 1640 1645

Pro Leu Pro Glu Ile Glu Asp Glu Val Phe Arg Lys Arg Asn Leu Thr
1650 1655 1660

Ile Met Asp Leu His Pro Gly Ser Gly Lys Thr Arg Arg Tyr Leu Pro
1665 1670 1675 1680

Ala Ile Val Arg Glu Ala Ile Arg Arg Asn Val Arg Thr Leu Ile Leu
1685 1690 1695

Ala Pro Thr Arg Val Val Ala Ser Glu Met Ala Glu Ala Leu Lys Gly
1700 1705 1710

Met Pro Ile Arg Tyr Gln Thr Thr Ala Val Lys Ser Glu His Thr Gly
1715 1720 1725

Lys Glu Ile Val Asp Leu Met Cys His Ala Thr Phe Thr Met Arg Leu
1730 1735 1740

Leu Ser Pro Val Arg Val Pro Asn Tyr Asn Met Ile Ile Met Asp Glu
1745 1750 1755 1760

Ala His Phe Thr Asp Pro Ala Ser Ile Ala Arg Arg Gly Tyr Ile Ser
1765 1770 1775

Thr Arg Val Gly Met Gly Glu Ala Ala Ala Ile Phe Met Thr Ala Thr
1780 1785 1790

Pro Pro Gly Ser Val Glu Ala Phe Pro Gln Ser Asn Ala Val Ile Gln
1795 1800 1805

Asp Glu Glu Arg Asp Ile Pro Glu Arg Ser Trp Asn Ser Gly Tyr Glu
1810 1815 1820

Trp Ile Thr Asp Phe Pro Gly Lys Thr Val Trp Phe Val Pro Ser Ile
1825 1830 1835 1840

Lys Ser Gly Asn Asp Ile Ala Asn Cys Leu Arg Lys Asn Gly Lys Arg
1845 1850 1855

Val Ile Gln Leu Ser Arg Lys Thr Phe Asp Thr Glu Tyr Gln Lys Thr
1860 1865 1870

Lys Asn Asn Asp Trp Asp Tyr Val Val Thr Thr Asp Ile Ser Glu Met
1875 1880 1885

Gly Ala Asn Phe Arg Ala Asp Arg Val Ile Asp Pro Arg Arg Cys Leu
1890 1895 1900

Lys Pro Val Ile Leu Lys Asp Gly Pro Glu Arg Val Ile Leu Ala Gly
1905 1910 1915 1920

Pro Met Pro Val Thr Val Ala Ser Ala Ala Gln Arg Arg Gly Arg Ile
1925 1930 1935

Gly Arg Asn Gln Asn Lys Glu Gly Asp Gln Tyr Val Tyr Met Gly Gln
1940 1945 1950

Pro Leu Asn Asn Asp Glu Asp His Ala His Trp Thr Glu Ala Lys Met
1955 1960 1965

Leu Leu Asp Asn Ile Asn Thr Pro Glu Gly Ile Ile Pro Ala Leu Phe
1970 1975 1980

Glu Pro Glu Arg Glu Lys Ser Ala Ala Ile Asp Gly Glu Tyr Arg Leu
1985 1990 1995 2000

Arg Gly Glu Ala Arg Lys Thr Phe Val Glu Leu Met Arg Arg Gly Asp
2005 2010 2015

Leu Pro Val Trp Leu Ser Tyr Lys Val Ala Ser Glu Gly Phe Gln Tyr
2020 2025 2030

Ser Asp Arg Arg Trp Cys Phe Asp Gly Glu Arg Asn Asn Gln Val Leu
2035 2040 2045

Glu Glu Asn Met Asp Val Glu Met Trp Thr Lys Glu Gly Glu Arg Lys
2050 2055 2060

Lys Leu Arg Pro Arg Trp Leu Asp Ala Arg Thr Tyr Ser Asp Pro Leu
2065 2070 2075 2080

Ala Leu Arg Glu Phe Lys Glu Phe Ala Ala Gly Arg Arg Ser Val Ser
2085 2090 2095

Gly Asp Leu Ile Leu Glu Ile Gly Lys Leu Pro Gln His Leu Thr Gln
2100 2105 2110

Arg Ala Gln Asn Ala Leu Asp Asn Leu Val Met Leu His Asn Ser Glu
2115 2120 2125

Gln Gly Gly Arg Ala Tyr Arg His Ala Met Glu Glu Leu Pro Asp Thr
2130 2135 2140

Ile Glu Thr Leu Met Leu Leu Ala Leu Ile Ala Val Leu Thr Gly Gly
2145 2150 2155 2160

Val Thr Leu Phe Phe Leu Ser Gly Lys Gly Leu Gly Lys Thr Ser Ile
2165 2170 2175

Gly Leu Leu Cys Val Met Ala Ser Ser Val Leu Leu Trp Met Ala Ser
2180 2185 2190

Val Glu Pro His Trp Ile Ala Ala Ser Ile Ile Leu Glu Phe Phe Leu
2195 2200 2205

Met Val Leu Leu Ile Pro Glu Pro Asp Arg Gln Arg Thr Pro Gln Asp
2210 2215 2220

Asn Gln Leu Ala Tyr Val Val Ile Gly Leu Leu Phe Met Ile Leu Thr
2225 2230 2235 2240

Val Ala Ala Asn Glu Met Gly Leu Leu Glu Thr Thr Lys Lys Asp Leu
2245 2250 2255

Gly Ile Gly His Val Ala Ala Glu Asn His His His Ala Thr Met Leu
2260 2265 2270 

Asp Val Asp Leu Arg Pro Ala Ser Ala Trp Thr Leu Tyr Ala Val Ala 
2275 2280 2285 

Thr Thr Val Ile Thr Pro Met Met Arg His Thr Ile Glu Asn Thr Thr
2290 2295 2300 

Ala Asn Ile Ser Leu Thr Ala Ile Ala Asn Gln Ala Ala Ile Leu Met
2305 2310 2315 2320

Gly Leu Asp Lys Gly Trp Pro Ile Ser Lys Met Asp Ile Gly Val Pro
2325 2330 2335

Leu Leu Ala Leu Gly Cys Tyr Ser Gln Val Asn Pro Leu Thr Leu Thr
2340 2345 2350

Ala Ala Val Leu Met Leu Val Ala His Tyr Ala Ile Ile Gly Pro Gly
2355 2360 2365

Leu Gln Ala Lys Ala Thr Arg Glu Ala Gln Lys Arg Thr Ala Ala Gly
2370 2375 2380

Ile Met Lys Asn Pro Thr Val Asp Gly Ile Val Ala Ile Asp Leu Asp
2385 2390 2395 2400

Pro Val Val Tyr Asp Ala Lys Phe Glu Lys Gln Leu Gly Gln Ile Met
2405 2410 2415

Leu Leu Ile Leu Cys Thr Ser Gln Ile Leu Leu Met Arg Thr Thr Trp
2420 2425 2430

Ala Leu Cys Glu Ser Ile Thr Leu Ala Thr Gly Pro Leu Thr Thr Leu
2435 2440 2445

Trp Glu Gly Ser Pro Gly Lys Phe Trp Asn Thr Thr Ile Ala Val Ser
2450 2455 2460

Met Ala Asn Ile Phe Arg Gly Ser Tyr Leu Ala Gly Ala Gly Leu Ala
2465 2470 2475 2480

Phe Ser Leu Met Lys Ser Leu Gly Gly Gly Arg Arg Gly Thr Gly Ala 
2485 2490 2495

Lys Gly Lys His Trp Glu Arg Asn Gly Lys Asp Arg Leu Asn Gln Leu
2500 2505 2510

Ser Lys Ser Glu Phe Asn Thr Tyr Lys Arg Ser Gly Ile Met Glu Val
2515 2520 2525

Asp Arg Ser Glu Ala Lys Glu Gly Leu Lys Arg Gly Glu Thr Thr Lys
2530 2535 2540

His Ala Val Ser Arg Gly Thr Ala Lys Leu Arg Trp Phe Val Glu Arg
2545 2550 2555 2560

Asn Leu Val Lys Pro Glu Gly Lys Val Ile Asp Leu Gly Cys Gly Arg
2565 2570 2575

Gly Gly Trp Ser Tyr Tyr Cys Ala Gly Leu Lys Lys Val Thr Glu Val 
2580 2585 2590

Lys Gly Tyr Thr Lys Gly Gly Pro Gly His Glu Glu Pro Ile Pro Met
2595 2600 2605

Ala Thr Tyr Gly Trp Asn Leu Val Lys Leu Tyr Ser Gly Lys Asp Val
2610 2615 2620

Phe Phe Thr Pro Pro Glu Lys Cys Asp Thr Leu Leu Cys Asp Ile Gly
2625 2630 2635 2640

Glu Ser Ser Pro Asn Pro Thr Ile Glu Glu Gly Arg Thr Leu Arg Val
2645 2650 2655

Leu Lys Met Val Glu Pro Trp Leu Arg Gly Asn Gln Phe Cys Ile Lys
2660 2665 2670

Ile Leu Asn Pro Tyr Met Pro Ser Val Val Glu Thr Leu Glu Gln Met
2675 2680 2685

Gln Arg Lys His Gly Gly Met Leu Val Arg Asn Pro Leu Ser Arg Asn
2690 2695 2700

Ser Thr His Glu Met Tyr Trp Val Ser Cys Gly Thr Gly Asn Ile Val
2705 2710 2715 2720

Ser Ala Val Asn Met Thr Ser Arg Met Leu Leu Asn Arg Phe Thr Met 
2725 2730 2735

Ala His Arg Lys Pro Thr Tyr Glu Arg Asp Val Asp Leu Gly Ala Gly
2740 2745 2750

Thr Arg His Val Ala Val Glu Pro Glu Val Ala Asn Leu Asp Ile Ile
2755 2760 2765

Gly Gln Arg Ile Glu Asn Ile Lys His Glu His Lys Ser Thr Trp His
2770 2775 2780

Tyr Asp Glu Asp Asn Pro Tyr Lys Thr Trp Ala Tyr His Gly Ser Tyr
2785 2790 2795 2800

Glu Val Lys Pro Ser Gly Ser Ala Ser Ser Met Val Asn Gly Val Val
2805 2810 2815

Lys Leu Leu Thr Lys Pro Trp Asp Ala Ile Pro Met Val Thr Gln Ile
2820 2825 2830

Ala Met Thr Asp Thr Thr Pro Phe Gly Gln Gln Arg Val Phe Lys Glu
2835 2840 2845

Lys Val Asp Thr Arg Thr Pro Lys Ala Lys Arg Gly Thr Ala Gln Ile
2850 2855 2860

Met Glu Val Thr Ala Arg Trp Leu Trp Gly Phe Leu Ser Arg Asn Lys
2865 2870 2875 2880

Lys Pro Arg Ile Cys Thr Arg Glu Glu Phe Thr Arg Lys Val Arg Ser
2885 2890 2895

Asn Ala Ala Ile Gly Ala Val Phe Val Asp Glu Asn Gln Trp Asn Ser
2900 2905 2910

Ala Lys Glu Ala Val Glu Asp Glu Arg Phe Trp Asp Leu Val His Arg
2915 2920 2925

Glu Arg Glu Leu His Lys Gln Gly Lys Cys Ala Thr Cys Val Tyr Asn
2930 2935 2940

Met Met Gly Lys Arg Glu Lys Lys Leu Gly Glu Phe Gly Lys Ala Lys
2945 2950 2955 2960

Gly Ser Arg Ala Ile Trp Tyr Met Trp Leu Gly Ala Arg Phe Leu Glu
2965 2970 2975

Phe Glu Ala Leu Gly Phe Met Asn Glu Asp His Trp Phe Ser Arg Glu 
2980 2985 2990 

Asn Ser Leu Ser Gly Val Glu Gly Glu Gly Leu His Lys Leu Gly Tyr 
2995 3000 3005 

Ile Leu Arg Asp Ile Ser Lys Ile Pro Gly Gly Asn Met Tyr Ala Asp
3010 3015 3020

Asp Thr Ala Gly Trp Asp Thr Arg Ile Thr Glu Asp Asp Leu Gln Asn
3025 3030 3035 3040

Glu Ala Lys Ile Thr Asp Ile Met Glu Pro Glu His Ala Leu Leu Ala
3045 3050 3055

Thr Ser Ile Phe Lys Leu Thr Tyr Gln Asn Lys Val Val Arg Val Gln
3060 3065 3070

Arg Pro Ala Lys Asn Gly Thr Val Met Asp Val Ile Ser Arg Arg Asp
3075 3080 3085

Gln Arg Gly Ser Gly Gln Val Gly Thr Tyr Gly Leu Asn Thr Phe Thr
3090 3095 3100

Asn Met Glu Ala Gln Leu Ile Arg Gln Met Glu Ser Glu Gly Ile Phe
3105 3110 3115 3120

Ser Pro Ser Glu Leu Glu Thr Pro Asn Leu Ala Glu Arg Val Leu Asp
3125 3130 3135

Trp Leu Glu Lys Tyr Gly Val Glu Arg Leu Lys Arg Met Ala Ile Ser
3140 3145 3150

Gly Asp Asp Cys Val Val Lys Pro Ile Asp Asp Arg Phe Ala Thr Ala
3155 3160 3165

Leu Thr Ala Leu Asn Asp Met Gly Lys Val Arg Lys Asp Ile Pro Gln
3170 3175 3180

Trp Glu Pro Ser Lys Gly Trp Asn Asp Trp Gln Gln Val Pro Phe Cys
3185 3190 3195 3200

Ser His His Phe His Gln Leu Ile Met Lys Asp Gly Arg Glu Ile Val
3205 3210 3215

Val Pro Cys Arg Asn Gln Asp Glu Leu Val Gly Arg Ala Arg Val Ser
3220 3225 3230

Gln Gly Ala Gly Trp Ser Leu Arg Glu Thr Ala Cys Leu Gly Lys Ser
3235 3240 3245

Tyr Ala Gln Met Trp Gln Leu Met Tyr Phe His Arg Arg Asp Leu Arg
3250 3255 3260

Leu Ala Ala Asn Ala Ile Cys Ser Ala Val Pro Val Asp Trp Val Pro
3265 3270 3275 3280

Thr Ser Arg Thr Thr Trp Ser Ile His Ala His His Gln Trp Met Thr
3285 3290 3295

Thr Glu Asp Met Leu Ser Val Trp Asn Arg Val Trp Ile Glu Glu Asn
3300 3305 3310

Pro Trp Met Glu Asp Lys Thr His Val Ser Ser Trp Glu Asp Val Pro
3315 3320 3325

Tyr Leu Gly Lys Arg Glu Asp Gln Trp Cys Gly Ser Leu Ile Gly Leu
3330 3335 3340

Thr Ala Arg Ala Thr Trp Ala Thr Asn Ile Gln Val Ala Ile Asn Gln
3345 3350 3355 3360

Val Arg Arg Leu Ile Gly Asn Glu Asn Tyr Leu Asp Tyr Met Thr Ser
3365 3370 3375

Met Lys Arg Phe Lys Asn Glu Ser Asp Pro Lys Gly His Ser Gly Glu 
3380 3385 3390 

Ser Thr His Leu 
3395 



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CCATGAATTC CCATGCGATG CGTGGGA 27 
(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CACATCTCGA GTCCGCTTGA ACCATGA 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TGGTTCCCGG GGACTCGGGA TGTGTA 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
ACTAAGCTTG ATCATGCAGA GACCATTGA 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
AATCAGAATT CTCTGCAGGG TCAGGGGAA 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATAACAAAGC TTATCTTTGT TTCTTTTTCT 
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GAAAGGATCC TCTGGAGTGT TATGGGACAC A
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ACCCAAGCTT CATCTTCTTC CTGCTGC 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGGAGGTCGA CGAGGTACGG GAGCC 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 
CAATGATATC TAGGTTGGCT 
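
Each of the primers in SEQ ID NOs: 3 to 12 carries a six-base palindrome near its 5' end. The text of this section does not name the corresponding enzymes, so the following Python sketch is offered only as a reading aid. It scans the primers for the recognition sequences of a few common restriction enzymes and compares each primer's length with the length declared in its header. The enzyme table `SITES`, the `PRIMERS` dictionary and the helper names `normalise` and `find_sites` are assumptions introduced for this example; the primer strings are transcribed from the listing above, with apparent scanning errors normalised to A, C, G or T.

```python
# Recognition sequences of some common restriction enzymes (standard values;
# the selection of enzymes is an assumption of this sketch, not from the text).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "SmaI": "CCCGGG",
    "SalI": "GTCGAC",
    "EcoRV": "GATATC",
    "PstI": "CTGCAG",
}

# Primers keyed by SEQ ID NO, paired with the length declared in each header.
PRIMERS = {
    3: ("CCATGAATTC CCATGCGATG CGTGGGA", 27),
    4: ("CACATCTCGA GTCCGCTTGA ACCATGA", 27),
    5: ("TGGTTCCCGG GGACTCGGGA TGTGTA", 26),
    6: ("ACTAAGCTTG ATCATGCAGA GACCATTGA", 29),
    7: ("AATCAGAATT CTCTGCAGGG TCAGGGGAA", 29),
    8: ("ATAACAAAGC TTATCTTTGT TTCTTTTTCT", 30),
    9: ("GAAAGGATCC TCTGGAGTGT TATGGGACAC A", 31),
    10: ("ACCCAAGCTT CATCTTCTTC CTGCTGC", 27),
    11: ("AGGAGGTCGA CGAGGTACGG GAGCC", 25),
    12: ("CAATGATATC TAGGTTGGCT", 20),
}


def normalise(seq):
    """Remove the blocks-of-ten spacing and upper-case the sequence."""
    return "".join(seq.split()).upper()


def find_sites(seq):
    """Return (enzyme, 0-based offset) pairs for every site found in seq."""
    seq = normalise(seq)
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)
    return hits


for seq_id, (primer, declared_length) in PRIMERS.items():
    length = len(normalise(primer))
    status = "ok" if length == declared_length else "length mismatch"
    print(f"SEQ ID NO: {seq_id}: {length} nt ({status}), sites: {find_sites(primer)}")
```

With the strings as transcribed here, every primer matches its declared length and contains at least one of the listed sites within its first eleven bases; SEQ ID NO: 7 additionally contains CTGCAG further along. Whether these sites were the ones actually used for cloning is not stated in this section.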
CLAIMS

1. DEN1-S275/90 (ECACC V92042111).

2. DEN1-S275/90 (ECACC V92042111) in inactivated form.

3. A DNA polynucleotide encoding DEN1-S275/90
(ECACC V92042111) whose sequence is substantially as shown
in Seq. ID No. 1.

4. A fragment of a DNA polynucleotide as claimed in
claim 3, said fragment encoding the C, C, PreM, M, E, NS1,
NS2A, NS2B, NS3, NS4A, NS4B or NS5 gene of DEN1-S275/90
(ECACC V92042111).

5. A DNA polynucleotide or a fragment thereof 
according to claim 3 or claim 4 in an expression vector. 

6. An expression vector as claimed in claim 5
selected from pGEX-KG/EX-20, pMAL-c/NS1-104, pMAL-cRI/NS2-1,
pGEX-KG/NS3 BH C600-1 and pGEX-KG/NS5 c600 HF1.

7. A cell harbouring an expression vector as claimed 
in claim 5 or claim 6. 

8. A cell as claimed in claim 7 which is E.coli or
an insect cell.

9. A polypeptide in substantially isolated form
which is the C, C, PreM, M, E, NS1, NS2A, NS2B, NS3, NS4A,
NS4B or NS5 polypeptide of DEN1-S275/90 (ECACC V92042111).

10. A polypeptide as claimed in claim 9 which is in
the form of a fusion protein.

11. A fusion protein as claimed in claim 10 which is 
coded by an expression vector selected from the expression 
vectors of claim 6. 

12. A method of preparing a polypeptide as claimed in
any one of claims 9 to 11 which comprises culturing a cell
line according to claim 7 or claim 8 and recovering the
polypeptide.

13. A polypeptide as claimed in claim 9 carrying a
label.

14. A vaccine comprising one or more polypeptides as
claimed in any one of claims 9 to 11 or the inactivated
virus as claimed in claim 2 in combination with a
pharmaceutically acceptable carrier or diluent.

15. The vaccine of claim 14 wherein one polypeptide
is selected from E, NS1, NS2, NS3, NS5 and fusion proteins
thereof capable of eliciting antibodies to a DEN1 viral
protein.

16. An antibody against a polypeptide as claimed in
any one of claims 9 to 11 capable of binding a DEN1 viral
protein, optionally carrying a revealing label.

17. A test kit for the detection of the presence or
absence of DEN1 virus comprising the antibody of claim 16
or the polypeptide of claim 9 or 13 fixed to a solid
support.

FIGURE 1

FIGURE 2

FIGURE 3



[Figure 3 image not reproduced. Recoverable labels: size markers 200, 97, 68, 43, 29, 18 and 14; lanes 1 to 12; bands marked NS5, NS3, NS2 and NS1.]



WO 93/22440 PCT/CA93/00182

INTERNATIONAL SEARCH REPORT

International Application No PCT/CA 93/00182

I. CLASSIFICATION OF SUBJECT MATTER (if several classification symbols apply, indicate all)

According to International Patent Classification (IPC) or to both National Classification and IPC

Int.Cl. 5 C12N15/40; C12N7/00; C07K13/00; A61K39/12; C12P21/00; G01N33/50; C12N15/62

II. FIELDS SEARCHED

Classification System: Int.Cl. 5
Classification Symbols: C07K; C12N

III. DOCUMENTS CONSIDERED TO BE RELEVANT

VIROLOGY
[volume, number and date illegible in the source]
pages 953 - 958
FU, J. ET AL. 'Full-Length cDNA Sequence of Dengue Type 1 Virus (Singapore Strain S275/90)'
see the whole document

VIROLOGY
vol. 174, no. 2, February 1990,
pages 479 - 493
RICO-HESSE, R. 'Molecular Evolution and Distribution of Dengue Viruses Type 1 and 2 in Nature'
see the whole document

Relevant to Claim No.: 1-17

IV. CERTIFICATION

Date of the Actual Completion of the International Search: 29 JUNE 1993
Date of Mailing of this International Search Report: 02 -09- 1993
International Searching Authority: EUROPEAN PATENT OFFICE
Signature of Authorized Officer: CHAMBONNET F.J.

Form PCT/ISA/210



International Application No PCT/CA 93/00182

III. DOCUMENTS CONSIDERED TO BE RELEVANT (CONTINUED FROM THE SECOND SHEET)

VACCINES 88 1988, COLD SPRING HARBOR
pages 398 - 389
LAI, C.-J. ET AL. 'Cloning Full-Length DNA Sequences of the Dengue Virus Genome for use in Elucidating Pathogenesis and Development of Immunoprophylaxis'
see the whole document






